Sorry for the late reply, see below for why :) . On Friday 03 February 2017 23:44:10 Kai Krakow wrote: > Am Thu, 02 Feb 2017 13:01:03 +0100 > > schrieb Marc Joliet <mar...@gmx.de>: [...] > > > Btrfs --repair refused to repair the filesystem telling me something > > > about compressed extents and an unsupported case, wanting me to > > > take an image and send it to the devs. *sigh* > > > > I haven't tried a repair yet; it's a big file system, and btrfs-check > > is still running: > > > > # btrfs check -p /dev/sdd2 > > Checking filesystem on /dev/sdd2 > > UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 > > parent transid verify failed on 3829276291072 wanted 224274 found > > 283858 parent transid verify failed on 3829276291072 wanted 224274 > > found 283858 parent transid verify failed on 3829276291072 wanted > > 224274 found 283858 parent transid verify failed on 3829276291072 > > wanted 224274 found 283858 Ignoring transid failure > > leaf parent key incorrect 3829276291072 > > bad block 3829276291072 > > > > ERROR: errors found in extent allocation tree or chunk allocation > > block group 4722282987520 has wrong amount of free space > > failed to load free space cache for block group 4722282987520 > > checking free space cache [O] > > root 32018 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32089 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32091 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32092 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32107 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32189 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32190 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32191 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32265 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32266 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32409 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32410 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32411 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32412 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32413 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32631 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32632 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32633 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32634 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32635 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32636 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32718 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32732 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > checking fs roots [o] > > > > I know that the "file extend discount" errors are fixable from my > > previous email to this ML, but what about the rest? From looking > > through the ML archives it seems that --repair won't be able to fix > > the transid failures. It seems that one person had success with the > > "usebackuproot" mount option, which I haven't tried yet.
Well, it turns out that the problem fixed itself. After check finished, I tried mounting the FS normally, which succeeded. Shortly after, btrfs-cleaner started removing deleted snapshots. About 40 minutes later I got the error again: Feb 02 22:49:14 thetick kernel: BTRFS: device label MARCEC_BACKUP devid 1 transid 283903 /dev/sdd2 Feb 02 22:49:19 thetick kernel: EXT4-fs (sdd1): mounted filesystem with ordered data mode. Opts: (null) Feb 03 00:18:52 thetick kernel: BTRFS info (device sdd2): use zlib compression Feb 03 00:18:52 thetick kernel: BTRFS info (device sdd2): disk space caching is enabled Feb 03 00:18:52 thetick kernel: BTRFS info (device sdd2): has skinny extents Feb 03 00:20:09 thetick kernel: BTRFS info (device sdd2): The free space cache file (3967375376384) is invalid. skip it Feb 03 01:05:58 thetick kernel: ------------[ cut here ]------------ Feb 03 01:05:58 thetick kernel: WARNING: CPU: 1 PID: 26544 at fs/btrfs/extent- tree.c:2967 btrfs_run_delayed_refs+0x26c/0x290 Feb 03 01:05:58 thetick kernel: BTRFS: Transaction aborted (error -17) Feb 03 01:05:58 thetick kernel: Modules linked in: dummy iptable_filter ip_tables x_tables joydev hid_logitech_hidpp hid_logitech_dj snd_hda_codec_hdmi snd_hda_codec_analog snd_hda_codec_generic uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core sn Feb 03 01:05:58 thetick kernel: ehci_pci ohci_hcd ehci_hcd Feb 03 01:05:58 thetick kernel: CPU: 1 PID: 26544 Comm: btrfs-cleaner Not tainted 4.9.7-gentoo #1 Feb 03 01:05:58 thetick kernel: Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 1701 10/30/2008 Feb 03 01:05:58 thetick kernel: 0000000000000000 ffffffff8137af83 ffffc90000a4fd30 0000000000000000 Feb 03 01:05:58 thetick kernel: ffffffff810537ad ffff880138b2ec80 ffffc90000a4fd80 ffff880138b4e850 Feb 03 01:05:58 thetick kernel: 00000000ffffffef 00000000000000f2 ffff880138b2ec80 ffffffff8105381a Feb 03 01:05:58 thetick kernel: Call Trace: Feb 03 01:05:58 thetick kernel: [<ffffffff8137af83>] ? dump_stack+0x46/0x63 Feb 03 01:05:58 thetick kernel: [<ffffffff810537ad>] ? __warn+0xed/0x110 Feb 03 01:05:58 thetick kernel: [<ffffffff8105381a>] ? warn_slowpath_fmt+0x4a/0x50 Feb 03 01:05:58 thetick kernel: [<ffffffff81286f7a>] ? __btrfs_run_delayed_refs+0x11ba/0x1390 Feb 03 01:05:58 thetick kernel: [<ffffffff8128a2ec>] ? btrfs_run_delayed_refs+0x26c/0x290 Feb 03 01:05:58 thetick kernel: [<ffffffff81287fbe>] ? walk_up_tree+0xde/0x1d0 Feb 03 01:05:58 thetick kernel: [<ffffffff8129f6d5>] ? btrfs_should_end_transaction+0x45/0x80 Feb 03 01:05:58 thetick kernel: [<ffffffff81288851>] ? btrfs_drop_snapshot+0x401/0x7a0 Feb 03 01:05:58 thetick kernel: [<ffffffff812a09f6>] ? btrfs_clean_one_deleted_snapshot+0xb6/0xf0 Feb 03 01:05:58 thetick kernel: [<ffffffff8129757f>] ? cleaner_kthread+0x13f/0x1b0 Feb 03 01:05:58 thetick kernel: [<ffffffff81297440>] ? btrfs_destroy_pinned_extent+0xc0/0xc0 Feb 03 01:05:58 thetick kernel: [<ffffffff81070ee9>] ? kthread+0xb9/0xd0 Feb 03 01:05:58 thetick kernel: [<ffffffff81070e30>] ? kthread_park+0x70/0x70 Feb 03 01:05:58 thetick kernel: [<ffffffff8160dfd2>] ? ret_from_fork+0x22/0x30 Feb 03 01:05:58 thetick kernel: ---[ end trace 9b83fb0ab757cd99 ]--- Feb 03 01:05:58 thetick kernel: BTRFS: error (device sdd2) in btrfs_run_delayed_refs:2967: errno=-17 Object already exists Feb 03 01:05:58 thetick kernel: BTRFS info (device sdd2): forced readonly Feb 03 01:26:39 thetick kernel: BTRFS error (device sdd2): cleaner transaction attach returned -30 Feb 03 01:26:59 thetick kernel: BTRFS info (device sdd2): use zlib compression Feb 03 01:26:59 thetick kernel: BTRFS info (device sdd2): disk space caching is enabled Feb 03 01:26:59 thetick kernel: BTRFS info (device sdd2): has skinny extents Feb 03 01:27:44 thetick kernel: BTRFS info (device sdd2): The free space cache file (4690842484736) is invalid. skip it I then mounted it again, after which btrfs-cleaner resumed its task: Feb 03 09:54:05 thetick kernel: BTRFS info (device sdd2): use zlib compression Feb 03 09:54:05 thetick kernel: BTRFS info (device sdd2): disk space caching is enabled Feb 03 09:54:05 thetick kernel: BTRFS info (device sdd2): has skinny extents This time it finished successfully, after that there are no more messages pertaining to that file system in dmesg. I then ran btrfs-check again, which didn't find the transid errors any more: # btrfs check -p /dev/sdd2 Checking filesystem on /dev/sdd2 UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 checking extents [o] block group 4722282987520 has wrong amount of free space failed to load free space cache for block group 4722282987520 checking free space cache [o] root 32018 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32089 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32091 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32092 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32107 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32189 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32190 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32191 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32265 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32266 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32409 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32410 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32411 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32412 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32413 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32631 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32632 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32633 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32634 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32635 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32636 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32718 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32732 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32830 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32832 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32833 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32834 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32903 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 32961 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 33002 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 checking fs roots [O] found 901576467411 bytes used err is 1 total csum bytes: 871524676 total tree bytes: 11151159296 total fs tree bytes: 8709533696 total extent tree bytes: 1307844608 btree space waste bytes: 2428036988 file data blocks allocated: 17604037283840 referenced 1003776770048 So, since the error was now limited to something btrfs-check could repair, I ran check with --repair: # btrfs check -p --repair /dev/sdd2 enabling repair mode Checking filesystem on /dev/sdd2 UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 checking extents [O]o] Fixed 0 roots. cache and super generation don't match, space cache will be invalidated Fixed discount file extents for inode: 95066 in root: 32018 Fixed discount file extents for inode: 95066 in root: 32089 Fixed discount file extents for inode: 95066 in root: 32091 Fixed discount file extents for inode: 95066 in root: 32092 Fixed discount file extents for inode: 95066 in root: 32107 Fixed discount file extents for inode: 95066 in root: 32189 Fixed discount file extents for inode: 95066 in root: 32190 Fixed discount file extents for inode: 95066 in root: 32191 Fixed discount file extents for inode: 95066 in root: 32265 Fixed discount file extents for inode: 95066 in root: 32266 Fixed discount file extents for inode: 95066 in root: 32409 Fixed discount file extents for inode: 95066 in root: 32410 Fixed discount file extents for inode: 95066 in root: 32411 Fixed discount file extents for inode: 95066 in root: 32412 Fixed discount file extents for inode: 95066 in root: 32413 Fixed discount file extents for inode: 95066 in root: 32631 Fixed discount file extents for inode: 95066 in root: 32632 Fixed discount file extents for inode: 95066 in root: 32633 Fixed discount file extents for inode: 95066 in root: 32634 Fixed discount file extents for inode: 95066 in root: 32635 Fixed discount file extents for inode: 95066 in root: 32636 Fixed discount file extents for inode: 95066 in root: 32718 Fixed discount file extents for inode: 95066 in root: 32732 Fixed discount file extents for inode: 95066 in root: 32830ts [O] Fixed discount file extents for inode: 95066 in root: 32832 Fixed discount file extents for inode: 95066 in root: 32833 btrfs unable to find ref byte nr 4607605415936 parent 0 root 32834 owner 0 offset 1 Fixed discount file extents for inode: 95066 in root: 32834 Fixed discount file extents for inode: 95066 in root: 32903 Fixed discount file extents for inode: 95066 in root: 32961 Fixed discount file extents for inode: 95066 in root: 33002 warning line 4112 [.] checking csums checking root refs found 901576467411 bytes used err is 0 total csum bytes: 871524676 total tree bytes: 11151159296 total fs tree bytes: 8709533696 total extent tree bytes: 1307844608 btree space waste bytes: 2428036988 file data blocks allocated: 17604037283840 referenced 1003776770048 This, BTW, *just* completed a few hours ago, i.e., it took almost a full week (it started on Friday, February 3rd, at about 20:53:35, and finished today at 20:42:48). Since it required between about 75% and 85% of the measly 4GB of RAM my desktop has, I decided not to use it for the duration of the --repair run, which is why I haven't replied until now (since that's the machine I use for private email). Running check without --repair, on the other hand, took only 9-10 hours. Right now I have a scrub running, the status of which is, at the moment of writing: # btrfs scrub status /media/MARCEC_BACKUP scrub status for f97b3cda-15e8-418b-bb9b-235391ef2a38 scrub started at Fri Feb 10 21:00:36 2017, running for 02:14:05 total bytes scrubbed: 631.51GiB with 0 errors (which is about 2x slower than expected so far) Otherwise, since I have the FS mounted now, I can now show this: # btrfs filesystem show /media/MARCEC_BACKUP/ Label: 'MARCEC_BACKUP' uuid: f97b3cda-15e8-418b-bb9b-235391ef2a38 Total devices 1 FS bytes used 841.75GiB devid 1 size 976.56GiB used 878.25GiB path /dev/sdd2 # btrfs filesystem usage /media/MARCEC_BACKUP/ Overall: Device size: 976.56GiB Device allocated: 878.25GiB Device unallocated: 98.31GiB Device missing: 0.00B Used: 852.13GiB Free (estimated): 117.95GiB (min: 68.80GiB) Data ratio: 1.00 Metadata ratio: 1.92 Global reserve: 512.00MiB (used: 0.00B) Data,single: Size:851.00GiB, Used:831.36GiB /dev/sdd2 851.00GiB Metadata,single: Size:1.12GiB, Used:0.00B /dev/sdd2 1.12GiB Metadata,DUP: Size:13.00GiB, Used:10.38GiB /dev/sdd2 26.00GiB System,DUP: Size:64.00MiB, Used:120.00KiB /dev/sdd2 128.00MiB Unallocated: /dev/sdd2 98.31GiB # btrfs filesystem df /media/MARCEC_BACKUP/ Data, single: total=851.00GiB, used=831.36GiB System, DUP: total=64.00MiB, used=120.00KiB Metadata, DUP: total=13.00GiB, used=10.38GiB Metadata, single: total=1.12GiB, used=0.00B GlobalReserve, single: total=512.00MiB, used=0.00B Hmm, I take it that the single metadata is a leftover from running --repair? [...] > > The system the drive runs on is: > > > > % uname -a > > Linux diefledermaus 4.9.7-gentoo #1 SMP Wed Feb 1 23:52:56 CET 2017 > > x86_64 Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz GenuineIntel GNU/Linux > > > > However during the crash it was running 4.9.6-gentoo. The system > > uses the standard CFQ scheduler, so perhaps BFQ is not at fault in > > Kai's case. > > > > The system I am running btrfs-check on is: > > > > % uname -a > > Linux thetick 4.9.6-gentoo #1 SMP PREEMPT Fri Jan 27 00:50:02 CET > > 2017 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ > > AuthenticAMD GNU/Linux > > > > Both have btrfs-progs 4.9: > > > > % /sbin/btrfs --version > > btrfs-progs v4.9 > > > > And the file system in question: > > > > % sudo /sbin/btrfs fi show /dev/sdd2 > > Label: 'MARCEC_BACKUP' uuid: > > f97b3cda-15e8-418b-bb9b-235391ef2a38 Total devices 1 FS bytes used > > 842.50GiB devid 1 size 976.56GiB used 877.31GiB path /dev/sdd2 > > > > The file system is mounted with > > "noatime,compress,comment=systemd.automount". > > So our common denominator is only "compress". Well, and systemd and > Gentoo - but that probably doesn't count. Yeah, probably not :) . > It looks like you're running without bcache and on a single disk. Exactly. > Do you use files that should maybe better be marked nocow? (chattr > +C)... in terms like VirtualBox images or database files? Nope. It's used exclusively as a target for btrfs-receive and rsync (OK, my music backups are also used by MPD on the home server it's normally attached to, as an alternative to leaving my desktop running just to hear music, but that's it). [...] Greetings -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup
signature.asc
Description: This is a digitally signed message part.