Re: btrfs send fail and check hang

2016-01-02 Thread Alistair Grant
On Sat, Jan 02, 2016 at 03:46:30PM +1100, Alistair Grant wrote:
> 
> When trying to send a snapshot I'm now getting errors such as:
> 
> ERROR: failed to open backups/xps13/@home/@home.20151229_13:43:09/alistair/.mozilla/firefox/yu3bxg7y.default/cookies.sqlite. No such file or directory
>
> [snip...]
> 
> General system information:
> 
> uname -a
> Linux alarmpi 4.1.15-1-ARCH #1 SMP Tue Dec 15 18:39:32 MST 2015 armv7l
> GNU/Linux
> 
> 
> btrfs --version
> btrfs-progs v4.3.1
> 
> 
> 
> mount | grep btrfs
> /dev/sda on /srv/d2root type btrfs (rw,noatime,compress-force=zlib,space_cache)
> 
> 
> > sudo btrfs fi show /srv/d2root
> Label: 'data2'  uuid: d8daaa62-afa2-4654-b7de-22fdc8456e03
>   Total devices 2 FS bytes used 117.34GiB
>   devid 1 size 1.82TiB used 118.03GiB path /dev/sda
>   devid 2 size 1.82TiB used 118.03GiB path /dev/sdb
> 
> 
> 
> > sudo btrfs fi df /srv/d2root
> Data, RAID1: total=117.00GiB, used=116.76GiB
> System, RAID1: total=32.00MiB, used=48.00KiB
> Metadata, RAID1: total=1.00GiB, used=595.36MiB
> GlobalReserve, single: total=208.00MiB, used=0.00B
> 
> 
> > sudo btrfs fi usage /srv/d2root
> Overall:
>     Device size:           3.64TiB
>     Device allocated:    236.06GiB
>     Device unallocated:    3.41TiB
>     Device missing:          0.00B
>     Used:                234.68GiB
>     Free (estimated):      1.70TiB  (min: 1.70TiB)
>     Data ratio:               2.00
>     Metadata ratio:           2.00
>     Global reserve:      208.00MiB  (used: 0.00B)
>
> Data,RAID1: Size:117.00GiB, Used:116.76GiB
>    /dev/sda  117.00GiB
>    /dev/sdb  117.00GiB
>
> Metadata,RAID1: Size:1.00GiB, Used:595.36MiB
>    /dev/sda    1.00GiB
>    /dev/sdb    1.00GiB
>
> System,RAID1: Size:32.00MiB, Used:48.00KiB
>    /dev/sda   32.00MiB
>    /dev/sdb   32.00MiB
>
> Unallocated:
>    /dev/sda    1.70TiB
>    /dev/sdb    1.70TiB


I've figured out a workaround for the errors, but I don't understand why
the workaround is needed.

The error that I was getting from the btrfs receive process was:

ERROR: failed to open backups/xps13/@home/@home.20151229_07:57:44/alistair/.mozilla/firefox/yu3bxg7y.default/cookies.sqlite. No such file or directory

It can be avoided by changing the receive command (which has worked fine
until now) from:

btrfs receive /srv/d2backups/xps13/@home

to:

btrfs receive /srv/d2root/backups/xps13/@home

(Not shown in the commands: while testing, I wrote the output from
btrfs send to a file and manually copied it across to the destination
machine.  Normally it is piped through ssh.)
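
For context, a minimal sketch of the normal pipeline (host name,
snapshot directory and variable names below are illustrative, not
copied from my actual script):

  #!/bin/bash
  # Sketch only: host, snapshot directory and names are illustrative.
  SNAPDIR=/home/snapshots
  PARENT='@home.20151229_07:57:44'
  NEW='@home.20160103_13:16:29'

  # Incremental send, piped through ssh.  The receive path resolves
  # through the top-level mount (/srv/d2root/...), which is the
  # workaround described above; receiving into the /srv/d2backups
  # subvolume mount produced the "failed to open" errors.
  sudo btrfs send -p "$SNAPDIR/$PARENT" -c "$SNAPDIR/$PARENT" "$SNAPDIR/$NEW" \
    | ssh alarmpi sudo btrfs receive /srv/d2root/backups/xps13/@home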

These are the same directory:

(As an added complication, there appears to be a bug somewhere in
4.1.15-1-ARCH, as the mount command isn't displaying the subvol mount
option.)

> mount | grep btrfs
/dev/sda on /srv/d2backups type btrfs (rw,noatime,compress-force=zlib,space_cache)
/dev/sda on /srv/d2root type btrfs (rw,noatime,compress-force=zlib,space_cache)

The original mount commands were:

> sudo mount -t btrfs -o compress-force=zlib,noatime,subvol=backups LABEL=data2 /srv/d2backups
> sudo mount -t btrfs -o compress-force=zlib,noatime LABEL=data2 /srv/d2root

That they are the same directory can be confirmed by:

> ls /srv/d2root
backups/  snapshots/
> ls /srv/d2root/backups
alistair-srv/  xps13/
> ls /srv/d2backups
alistair-srv/  xps13/

Version information about the destination machine is in my original
message below.

In case information about the source machine is useful:

> uname -a
Linux alistair-xps13 4.2.0-22-generic #27-Ubuntu SMP Thu Dec 17 22:57:08
UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

> btrfs --version
btrfs-progs v4.0

The send command was:

sudo btrfs send -p '@home.20151229_07:57:44' -c '@home.20151229_07:57:44' '@home.20160103_13:16:29'

> mount | grep btrfs
/dev/sda4 on /home type btrfs (rw,noatime,compress=zlib,ssd,space_cache,autodefrag,subvolid=257,subvol=/@home)
/dev/sda4 on /srv/home type btrfs (rw,noatime,compress=zlib,ssd,space_cache,autodefrag,subvolid=5,subvol=/)

I don't believe there have been any recent updates to the kernel or
btrfs-progs on either machine.

If you would like any more information, please let me know.

Thanks,
Alistair




btrfs send fail and check hang

2016-01-01 Thread Alistair Grant
Hi,

When trying to send a snapshot I'm now getting errors such as:

ERROR: failed to open backups/xps13/@home/@home.20151229_13:43:09/alistair/.mozilla/firefox/yu3bxg7y.default/cookies.sqlite. No such file or directory

and

ERROR: could not find parent subvolume

The script doing these sends has been running without a problem for
several weeks.

I can reboot the system and the filesystem mounts without a problem.  I
can also navigate through the existing snapshots and access files
without any problem (these are all read-only snapshots, so I'm not
attempting to write anything).  There are no obvious errors in the
system log (I checked the log manually, and also have Marc Merlin's sec
script running to monitor for errors).

I tried running a read-only btrfs check; however, it hangs while
checking fs roots:

> sudo umount /srv/d2root
> sudo btrfs check /dev/sda
checking filesystem on /dev/sda
UUID: d8daaa62-afa2-4654-b7de-22fdc8456e03
checking extents
checking free space cache
checking fs roots
^C

Disk IO was several MB/s during the initial part of the check and
dropped to 0 on checking fs roots.  I left it for about 10 minutes
before interrupting.

The same happens for /dev/sdb.

General system information:

uname -a
Linux alarmpi 4.1.15-1-ARCH #1 SMP Tue Dec 15 18:39:32 MST 2015 armv7l
GNU/Linux


btrfs --version
btrfs-progs v4.3.1



mount | grep btrfs
/dev/sda on /srv/d2root type btrfs (rw,noatime,compress-force=zlib,space_cache)


> sudo btrfs fi show /srv/d2root
Label: 'data2'  uuid: d8daaa62-afa2-4654-b7de-22fdc8456e03
Total devices 2 FS bytes used 117.34GiB
devid 1 size 1.82TiB used 118.03GiB path /dev/sda
devid 2 size 1.82TiB used 118.03GiB path /dev/sdb



> sudo btrfs fi df /srv/d2root
Data, RAID1: total=117.00GiB, used=116.76GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=1.00GiB, used=595.36MiB
GlobalReserve, single: total=208.00MiB, used=0.00B


> sudo btrfs fi usage /srv/d2root
Overall:
    Device size:           3.64TiB
    Device allocated:    236.06GiB
    Device unallocated:    3.41TiB
    Device missing:          0.00B
    Used:                234.68GiB
    Free (estimated):      1.70TiB  (min: 1.70TiB)
    Data ratio:               2.00
    Metadata ratio:           2.00
    Global reserve:      208.00MiB  (used: 0.00B)

Data,RAID1: Size:117.00GiB, Used:116.76GiB
   /dev/sda  117.00GiB
   /dev/sdb  117.00GiB

Metadata,RAID1: Size:1.00GiB, Used:595.36MiB
   /dev/sda    1.00GiB
   /dev/sdb    1.00GiB

System,RAID1: Size:32.00MiB, Used:48.00KiB
   /dev/sda   32.00MiB
   /dev/sdb   32.00MiB

Unallocated:
   /dev/sda    1.70TiB
   /dev/sdb    1.70TiB


Help!

Many Thanks,
Alistair



Re: Trouble with broken RAID5/6 System after trying to solve a problem, want to recover contained Data

2015-12-29 Thread Alistair Grant
On Tue, Dec 29, 2015 at 7:16 AM, Christian  wrote:
> I found out that I can also show more information about the files and
> directories contained in the filesystem.
>
> btrfs-debug-tree shows me the following info:
>
> parent transid verify failed on 2234958286848 wanted 35674 found 35675
> parent transid verify failed on 2234958286848 wanted 35674 found 35675
> ...
>
>
> Does someone know how to recover the files from the Filesystem?
>
> Kind regards,
> Christian

The first place to look is: https://btrfs.wiki.kernel.org/index.php/Restore

If you search the list history, Duncan has also written several
informative responses.  The ones I have bookmarked are:

* https://mail-archive.com/linux-btrfs@vger.kernel.org/msg49181.html
  - my original cry for help
* https://mail-archive.com/linux-btrfs@vger.kernel.org/msg49282.html
  - this is a shortcut to the main description, but the whole thread
    should probably be read.

Cheers,
Alistair


Re: Fixing recursive fault and parent transid verify failed

2015-12-12 Thread Alistair Grant
On Wed, Dec 09, 2015 at 10:19:41AM +, Duncan wrote:
> Alistair Grant posted on Wed, 09 Dec 2015 09:38:47 +1100 as excerpted:
> 
> > On Tue, Dec 08, 2015 at 03:25:14PM +, Duncan wrote:
> > Thanks again Duncan for your assistance.
> > 
> > I plugged the ext4 drive I planned to use for the recovery into the
> > machine and immediately got a couple of errors, which makes me wonder
> > whether there isn't a hardware problem with the machine somewhere.
> > 
> > So I decided to move to another machine to do the recovery.
> 
> Ouch!  That can happen, and if you moved the ext4 drive to a different 
> machine and it was fine there, then it's not the drive.
> 
> But you didn't say what kind of errors or if you checked SMART, or even 
> how it was plugged in (USB or SATA-direct or...).  So I guess you have 
> that side of things under control.  (If not, there's some here who know 
> quite a bit about that sort of thing...)

Yep, I'm familiar enough with smartmontools, etc. to (hopefully) figure
this out on my own.


> 
> > So I'm now recovering on Arch Linux 4.1.13-1 with btrfs-progs v4.3.1
> > (the latest version from archlinuxarm.org).
> > 
> > Attempting:
> > 
> > sudo btrfs restore -S -m -v /dev/sdb /mnt/btrfs-recover/ 2>&1 | tee btrfs-recover.log
> > 
> > only recovered 53 of the more than 106,000 files that should be
> > available.
> > 
> > The log is available at:
> > 
> > https://www.dropbox.com/s/p8bi6b8b27s9mhv/btrfs-recover.log?dl=0
> > 
> > I did attempt btrfs-find-root, but couldn't make sense of the output:
> > 
> > https://www.dropbox.com/s/qm3h2f7c6puvd4j/btrfs-find-root.log?dl=0
> 
> Yeah, btrfs-find-root's output deciphering takes a bit of knowledge.  
> Between what I had said and the wiki, I was hoping you could make sense 
> of things without further help, but...
>
> ...

It turns out that a drive from a separate filesystem was dying and
causing all the weird behaviour on the original machine.

Having two failures at the same time (drive physical failure and btrfs
filesystem corruption) was a bit too much for me, so I aborted the btrfs
restore attempts, bought a replacement drive and just went back to the
backups (for both failures).

Unfortunately, I now won't be able to determine whether there was any
connection between the failures or not.

So while I didn't get to practice my restore skills, the good news is
that it is all back up and running without any problems (yet :-)).

Thank you very much for the description and detailed set of steps for
using btrfs-find-root and restore.  While I didn't get to use them this
time, I've added links to the mailing list archive in my btrfs wiki user
page so I can find my way back (and if others search for restore and
find-root they may also benefit from your effort).

Thanks again,
Alistair



Re: Fixing recursive fault and parent transid verify failed

2015-12-08 Thread Alistair Grant
On Tue, Dec 08, 2015 at 03:25:14PM +, Duncan wrote:
> Alistair Grant posted on Tue, 08 Dec 2015 06:55:04 +1100 as excerpted:
> 
> > On Mon, Dec 07, 2015 at 01:48:47PM +, Duncan wrote:
> >> Alistair Grant posted on Mon, 07 Dec 2015 21:02:56 +1100 as excerpted:
> >> 
> >> > I think I'll try the btrfs restore as a learning exercise, and to
> >> > check the contents of my backup (I don't trust my memory, so
> >> > something could have changed since the last backup).
> >> 
> >> Trying btrfs restore is an excellent idea.  It'll make things far
> >> easier if you have to use it for real some day.
> >> 
> >> Note that while I see your kernel is reasonably current (4.2 series), I
> >> don't know what btrfs-progs ubuntu ships.  There have been some marked
> >> improvements to restore somewhat recently, checking the wiki
> >> btrfs-progs release-changelog list says 4.0 brought optional metadata
> >> restore, 4.0.1 added --symlinks, and 4.2.3 fixed a symlink path check
> >> off-by-one error. (And don't use 4.1.1 as its mkfs.btrfs is broken and
> >> produces invalid filesystems.)  So you'll want at least progs 4.0 to
> >> get the optional metadata restoration, and 4.2.3 to get full symlinks
> >> restoration support.
> >> 
> >> ...


Thanks again Duncan for your assistance.

I plugged the ext4 drive I planned to use for the recovery into the
machine and immediately got a couple of errors, which makes me wonder
whether there isn't a hardware problem with the machine somewhere.  So
I decided to move to another machine to do the recovery.

So I'm now recovering on Arch Linux 4.1.13-1 with btrfs-progs v4.3.1
(the latest version from archlinuxarm.org).

Attempting:

sudo btrfs restore -S -m -v /dev/sdb /mnt/btrfs-recover/ 2>&1 | tee btrfs-recover.log

only recovered 53 of the more than 106,000 files that should be available.

The log is available at: 

https://www.dropbox.com/s/p8bi6b8b27s9mhv/btrfs-recover.log?dl=0

I did attempt btrfs-find-root, but couldn't make sense of the output:

https://www.dropbox.com/s/qm3h2f7c6puvd4j/btrfs-find-root.log?dl=0

Simply mounting the drive, then re-mounting it read only, and rsync'ing
the files to the backup drive recovered 97,974 files before crashing.
If anyone is interested, I've uploaded a photo of the console to:

https://www.dropbox.com/s/xbrp6hiah9y6i7s/rsync%20crash.jpg?dl=0
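
In case anyone wants to try the same approach, a rough sketch of the
mount-read-only-and-rsync recovery (device names and mount points below
are illustrative):

  # Sketch only: device and paths are illustrative.
  sudo mount /dev/sdb /mnt/damaged
  sudo mount -o remount,ro /mnt/damaged

  # Copy whatever is still readable to a healthy filesystem; -a
  # preserves permissions and timestamps, -v lists files as they go.
  sudo rsync -av /mnt/damaged/ /mnt/recovered/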

I'm currently running a hashdeep audit between the recovered files and
the backup to see how the recovery went.

If you'd like me to try any other tests, I'll keep the damaged file
system for at least the next day or so.

Thanks again for all your assistance,
Alistair



Re: Fixing recursive fault and parent transid verify failed

2015-12-07 Thread Alistair Grant
On Mon, Dec 07, 2015 at 08:25:01AM +, Duncan wrote:
> Alistair Grant posted on Mon, 07 Dec 2015 12:57:15 +1100 as excerpted:
> 
> > I've run btrfs scrub and btrfsck on the drives, with the output included
> > below.  Based on what I've found on the web, I assume that a
> > btrfs-zero-log is required.
> > 
> > * Is this the recommended path?
> 
> [Just replying to a couple more minor points, here.]
> 
> Absolutely not.  btrfs-zero-log isn't the tool you need here.
> 
> About the btrfs log...
> 
> Unlike most journaling filesystems, btrfs is designed to be atomic and 
> consistent at commit time (every 30 seconds by default) and doesn't log 
> normal filesystem activity at all.  The only thing logged is fsyncs, 
> allowing them to deliver on their file-written-to-hardware guarantees, 
> without forcing the entire atomic filesystem sync, which would trigger a 
> normal atomic commit and thus is a far heavier weight process.  IOW, all 
> it does is log and speed up fsyncs.  The filesystem is designed to be 
> atomically consistent at commit time, with or without the log, with the 
> only thing missing if the log isn't replayed being the last few seconds 
> of fsyncs since the last atomic commit.
> 
> So the btrfs log is very limited in scope and will in many cases be 
> entirely empty, if there were no fsyncs after the last atomic filesystem 
> commit, again, every 30 seconds by default, so in human terms at least, 
> not a lot of time.
> 
> About btrfs log replay...
> 
> The kernel, meanwhile, is designed to replay the log automatically at 
> mount time.  If the mount is successful, the log has by definition been 
> replayed successfully and zeroing it wouldn't have done much of anything 
> but possibly lose you a few seconds worth of fsyncs.
> 
> Since you are able to run scrub, which requires a writable mount, the 
> mount is definitely successful, which means btrfs-zero-log is the wrong 
> tool for the job, since it addresses a problem you obviously don't have.

OK, thanks for the detailed explanation (here and below, so I don't have
to repeat myself).

The reason I thought it might be required was that the parent transid
verify failed errors appeared even after a reboot (and thus a remount
of the filesystem) and without any user activity.

> 
> > * Is there a way to find out which files will be affected by the loss of
> >   the transactions?
> 
> I'm interpreting that question in the context of the transid wanted/found 
> listings in your linked logs, since it no longer makes sense in the 
> context of btrfs-zero-log, given the information above.
> 
> I believe so, but the most direct method requires manual use of
> btrfs-debug and similar tools, looking up addresses and tracing down the files 
> to which they belong.  Of course that's if the addresses trace to actual 
> files at all.  If they trace to metadata instead of data, then it's not 
> normally files, but the metadata (including checksums and very small 
> files of only a few KiB) about files, instead.  Of course if it's 
> metadata the problem's worse, as a single bad metadata block can affect 
> multiple actual files.
> 
> The more indirect way would be to use btrfs restore with the -t option, 
> feeding it the root address associated with the transid found (with that 
> association traced via btrfs-find-root), to restore the file from the 
> filesystem as it existed at that point, to some other mounted filesystem, 
> also using the restore metadata option.  You could then do for instance a 
> diff of the listing (or possibly a per-file checksum, say md5sum, of both 
> versions) between your current backup (or current mounted filesystem, 
> since you can still mount it) and the restored version, which would be 
> the files at the time of that transaction-id, and see which ones 
> changed.  That of course would be the affected files. =:^]
> 

I think I'll try the btrfs restore as a learning exercise, and to check
the contents of my backup (I don't trust my memory, so something could
have changed since the last backup).

Does btrfs restore require the path to be on a btrfs filesystem?  I've
got an existing ext4 drive with enough free space to do the restore, so
would prefer to use it rather than buy another drive.

My plan is:

* btrfs restore /dev/sdX /path/to/ext4/restorepoint
** Where /dev/sdX is one of the two drives that were part of the raid1
   filesystem
* hashdeep audit the restored drive and backup (a sketch follows below)
* delete the existing corrupted btrfs filesystem and recreate it
* rsync the merged filesystem (from backup and restore) on to the new
  filesystem

Any comments or suggestions are welcome.
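
For the hashdeep step, a rough sketch of the audit I have in mind
(paths are illustrative; double-check the flags against your hashdeep
manpage):

  # Sketch only: paths are illustrative.
  # 1. Hash the backup tree, recursively, with relative paths (-l).
  cd /mnt/backup && hashdeep -r -l . > /tmp/backup.hashes

  # 2. Audit the restored tree against those hashes: -a enables audit
  #    mode, -k names the known-hashes file, -vv reports mismatches.
  cd /mnt/btrfs-recover && hashdeep -r -l -a -vv -k /tmp/backup.hashes .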


> > I do have a backup of the drive (which I believe is completely up to
> > date, the btrfs volume is used for archiving media and documents, and
> > single person use of git repositories, i.e. only very light writing
> > and reading).

Re: Fixing recursive fault and parent transid verify failed

2015-12-07 Thread Alistair Grant
On Mon, Dec 07, 2015 at 01:48:47PM +, Duncan wrote:
> Alistair Grant posted on Mon, 07 Dec 2015 21:02:56 +1100 as excerpted:
> 
> > I think I'll try the btrfs restore as a learning exercise, and to check
> > the contents of my backup (I don't trust my memory, so something could
> > have changed since the last backup).
> 
> Trying btrfs restore is an excellent idea.  It'll make things far easier 
> if you have to use it for real some day.
> 
> Note that while I see your kernel is reasonably current (4.2 series), I 
> don't know what btrfs-progs ubuntu ships.  There have been some marked 
> improvements to restore somewhat recently, checking the wiki btrfs-progs 
> release-changelog list says 4.0 brought optional metadata restore, 4.0.1 
> added --symlinks, and 4.2.3 fixed a symlink path check off-by-one error.  
> (And don't use 4.1.1 as its mkfs.btrfs is broken and produces invalid 
> filesystems.)  So you'll want at least progs 4.0 to get the optional 
> metadata restoration, and 4.2.3 to get full symlinks restoration support.
> 

Ubuntu 15.10 comes with btrfs-progs v4.0.  It looks like it is easy
enough to compile and install the latest version from
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git so
I'll do that.
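
A rough sketch of the build, in case it helps anyone (the dependency
list is my guess for Ubuntu 15.10 and may be incomplete; the INSTALL
file in the repository has the authoritative steps):

  # Sketch only: package list may be incomplete for your release.
  sudo apt-get install git build-essential autoconf automake pkg-config \
      uuid-dev libblkid-dev liblzo2-dev zlib1g-dev libattr1-dev

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git
  cd btrfs-progs
  ./autogen.sh && ./configure
  make && sudo make install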

Should I stick to 4.2.3 or use the latest 4.3.1?


> > Does btrfs restore require the path to be on a btrfs filesystem?  I've
> > got an existing ext4 drive with enough free space to do the restore, so
> > would prefer to use it rather than buy another drive.
> 
> Restoring to ext4 should be fine.
> 
> Btrfs restore writes files as would an ordinary application, the reason 
> metadata restoration is optional (otherwise it uses normal file change 
> and mod times, with files written as the running user, root, using umask-
> based file perms, all exactly the same as if it were a normal file 
> writing application), so it will restore to any normal filesystem.  The 
> filesystem it's restoring /from/ of course must be btrfs... unmounted 
> since it's designed to be used when mounting is broken, but it writes 
> files normally, so can write them to any filesystem.
> 
> FWIW, I restored to my reiserfs based media partition (still on spinning 
> rust, my btrfs are all on ssd) here, since that's where I had the room to 
> work with.
>

Thanks for the confirmation.

 
> > My plan is:
> > 
> > * btrfs restore /dev/sdX /path/to/ext4/restorepoint
> > ** Where /dev/sdX is one of the two drives that were part of the raid1
> >    filesystem
> > * hashdeep audit the restored drive and backup
> > * delete the existing corrupted btrfs filesystem and recreate
> > * rsync the merged filesystem (from backup and restore)
> >   on to the new filesystem
> > 
> > Any comments or suggestions are welcome.
> 
> 
> Looks very reasonable, here.  There's a restore page on the wiki with 
> more information than the btrfs-restore manpage, describing how to use it 
> with btrfs-find-root if necessary, etc.
> 
> https://btrfs.wiki.kernel.org/index.php/Restore
> 

I'd seen this, but it isn't explicit about the target filesystem
support.  I should try to update the page a bit.


> Some details on the page are a bit dated; it doesn't cover the dryrun, 
> list-roots, metadata and symlink options, for instance, and these can be 
> very helpful, but the general idea remains the same.
> 
> The general idea is to use btrfs-find-root to get a listing of available 
> root generations (if restore can't find a working root from the 
> superblocks or you want to try restoring an earlier root), then feed the 
> corresponding bytenr to restore's -t option.
> 
> Note that generation and transid refer to the same thing, a normally 
> increasing number, so higher generations are newer.  The wiki page makes 
> this much clearer than it used to, but the old wording anyway was 
> confusing to me until I figured that out.
> 
> Where the wiki page talks about root object-ids, those are the various 
> subtrees, low numbers are the base trees, 256+ are subvolumes/snapshots.  
> Note that restore's list-roots option lists these for the given bytenr as 
> well.
> 
> So you try restore with list-roots (-l) to see what it gives you, try 
> btrfs-find-root if not satisfied, to find older generations and get their 
> bytenrs to plug into restore with -t, and then confirm specific 
> generation bytenrs with list-roots again.
> 
> Once you have a good generation/bytenr candidate, try a dry-run (-D) to 
> see if you get a list of files it's trying to restore that looks 
> reasonable.
> 
> If the dry-run goes well, you can try the full restore, not forgetting 
> the metadata and symlinks options (-m, -S, respectively), if desired.
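
Pulling those steps together for my own notes, a minimal sketch of the
workflow as I understand it (the device name and bytenr below are
illustrative, not from a real run):

  # Sketch only: /dev/sdX and the bytenr are illustrative.
  # 1. List the roots restore can find from the superblocks.
  sudo btrfs restore -l /dev/sdX

  # 2. If none look usable, hunt for older tree roots; generation ==
  #    transid, and higher generations are newer.
  sudo btrfs-find-root /dev/sdX

  # 3. Confirm what a candidate root (by bytenr) contains.
  sudo btrfs restore -l -t 123456789 /dev/sdX

  # 4. Dry-run to see which files would be restored from that root.
  sudo btrfs restore -D -t 123456789 /dev/sdX /mnt/restore

  # 5. Full restore, with metadata (-m) and symlinks (-S).
  sudo btrfs restore -m -S -t 123456789 /dev/sdX /mnt/restore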

Fixing recursive fault and parent transid verify failed

2015-12-06 Thread Alistair Grant
Hi,

(Resending, as it looks like the first attempt didn't get through,
probably because it was too large, so the logs are now in Dropbox.)

I have a btrfs volume which is raid1 across two spinning rust disks,
each 2TB.

When trying to access some files from another machine using sshfs, the
server machine has crashed twice, resulting in a hard lock-up, i.e.
power off was required to restart the machine.

There are no crash dumps in /var/log/syslog, or anything that looks like
an associated error message to me; however, on the second occasion I was
able to see the following message flash up on the console (in addition
to some stack dumps):

Fixing recursive fault, but reboot is needed

I've run btrfs scrub and btrfsck on the drives, with the output
included below.  Based on what I've found on the web, I assume that a
btrfs-zero-log is required.

* Is this the recommended path?
* Is there a way to find out which files will be affected by the loss of
  the transactions?

I do have a backup of the drive (which I believe is completely up to
date, the btrfs volume is used for archiving media and documents, and
single person use of git repositories, i.e. only very light writing and
reading).

Some basic details:

OS: Ubuntu 15.10
Kernel: Ubuntu 4.2.0-19-generic (which is based on mainline 4.2.6)

> sudo btrfs fi df /srv/d2root

Data, RAID1: total=250.00GiB, used=248.86GiB
Data, single: total=8.00MiB, used=0.00B
System, RAID1: total=8.00MiB, used=64.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID1: total=1.00GiB, used=466.77MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=160.00MiB, used=0.00B

> sudo btrfs fi usage /srv/d2root

Overall:
    Device size:           3.64TiB
    Device allocated:    502.04GiB
    Device unallocated:    3.15TiB
    Device missing:          0.00B
    Used:                498.62GiB
    Free (estimated):      1.58TiB  (min: 1.58TiB)
    Data ratio:               2.00
    Metadata ratio:           1.99
    Global reserve:      160.00MiB  (used: 0.00B)

Data,single: Size:8.00MiB, Used:0.00B
   /dev/sdc      8.00MiB

Data,RAID1: Size:250.00GiB, Used:248.86GiB
   /dev/sdb    250.00GiB
   /dev/sdc    250.00GiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/sdc      8.00MiB

Metadata,RAID1: Size:1.00GiB, Used:466.77MiB
   /dev/sdb      1.00GiB
   /dev/sdc      1.00GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/sdc      4.00MiB

System,RAID1: Size:8.00MiB, Used:64.00KiB
   /dev/sdb      8.00MiB
   /dev/sdc      8.00MiB

Unallocated:
   /dev/sdb      1.57TiB
   /dev/sdc      1.57TiB


btrfs scrub output:
https://www.dropbox.com/s/blqvopa1lhkghe5/scrub.log?dl=0


btrfsck sdb output:
https://www.dropbox.com/s/hw6w6cupuu1rny4/btrfsck.sdb.log?dl=0


btrfsck sdc output:
https://www.dropbox.com/s/mijz492mjr76p8z/btrfsck.sdc.log?dl=0
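
For reference, the rough shape of the commands behind those logs
(reconstructed from memory, not copied from the logs; -B keeps scrub in
the foreground, -d prints per-device stats):

  # Sketch only: reconstructed invocations, not taken from the logs above.
  sudo btrfs scrub start -Bd /srv/d2root
  sudo btrfsck /dev/sdb 2>&1 | tee btrfsck.sdb.log
  sudo btrfsck /dev/sdc 2>&1 | tee btrfsck.sdc.log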



Thanks very much,
Alistair
