On 2018/12/31 11:52 PM, Larkin Lowrey wrote:
> On 10/11/2018 12:15 AM, Chris Murphy wrote:
>> Is this a 68T file system? Seems excessive.
>> Haha, by excessive I mean nuking such a big fs just for being unable
>> to remove the space tree. I'm quite sure the devs would like to get
>> that crashing bug fixed, anyway.
>
> A second FS just started failing. I never had this much trouble with
> space cache v1.
>
> This host had a DIMM failure a couple of weeks ago which caused the
> system to halt due to uncorrectable ECC error(s).

That looks like a very plausible cause of the corruption.

As with the strange items in the extent tree of your other fs, if your
memory is unreliable, any of your filesystems may be corrupted.

And with memory corruption, the hotter a tree block is, the more likely
it is to become a victim.

In both cases the corruption is in the extent tree, which matches the
symptom.

Please run btrfs check on all your filesystems.

Thanks,
Qu

> That was the only
> recent unsafe shutdown. Other than that, things have been running
> normally until today when the FS went read-only during backups. As with
> the other host, I tried to clear the space-cache (v2) before doing a
> 'check --repair' but got this:
>
> [root@fubar ~]# btrfs check --clear-space-cache=v2 /dev/Cached/Nearline
> Opening filesystem to check...
> Checking filesystem on /dev/Cached/Nearline
> UUID: 68d31d5f-97a2-4a73-a398-c7c13ff439a5
> Clear free space cache v2
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> extent-tree.c:2703: alloc_reserved_tree_block: BUG_ON `ret` triggered,
> value -17
> btrfs(+0x1ff96)[0x55eae7dc5f96]
> btrfs(+0x2109f)[0x55eae7dc709f]
> btrfs(+0x2115e)[0x55eae7dc715e]
> btrfs(+0x22054)[0x55eae7dc8054]
> btrfs(+0x22c57)[0x55eae7dc8c57]
> btrfs(btrfs_alloc_free_block+0xc2)[0x55eae7dcca72]
> btrfs(__btrfs_cow_block+0x18a)[0x55eae7dbc05a]
> btrfs(btrfs_cow_block+0x104)[0x55eae7dbc874]
> btrfs(btrfs_search_slot+0x35f)[0x55eae7dbf6cf]
> btrfs(btrfs_clear_free_space_tree+0x104)[0x55eae7de8b54]
> btrfs(cmd_check+0xb11)[0x55eae7e0ce31]
> btrfs(main+0x88)[0x55eae7dbaaa8]
> /lib64/libc.so.6(__libc_start_main+0xf3)[0x7fead8094413]
> btrfs(_start+0x2e)[0x55eae7dbabbe]
> Aborted (core dumped)
>
> # btrfs fi show /public/nearline/
> Label: none uuid: 68d31d5f-97a2-4a73-a398-c7c13ff439a5
> Total devices 1 FS bytes used 61.09TiB
> devid 1 size 65.25TiB used 61.45TiB path
> /dev/mapper/Cached-Nearline
>
> # btrfs fi df /public/nearline/
> Data, single: total=61.39TiB, used=61.03TiB
> System, single: total=32.00MiB, used=6.59MiB
> Metadata, single: total=67.00GiB, used=65.85GiB
> GlobalReserve, single: total=512.00MiB, used=4.02MiB
>
> # btrfs fi usage /public/nearline/
> Overall:
> Device size: 65.25TiB
> Device allocated: 61.45TiB
> Device unallocated: 3.79TiB
> Device missing: 0.00B
> Used: 61.09TiB
> Free (estimated): 4.15TiB (min: 4.15TiB)
> Data ratio: 1.00
> Metadata ratio: 1.00
> Global reserve: 512.00MiB (used: 4.02MiB)
>
> Data,single: Size:61.39TiB, Used:61.03TiB
> /dev/mapper/Cached-Nearline 61.39TiB
>
> Metadata,single: Size:67.00GiB, Used:65.85GiB
> /dev/mapper/Cached-Nearline 67.00GiB
>
> System,single: Size:32.00MiB, Used:6.59MiB
> /dev/mapper/Cached-Nearline 32.00MiB
>
> Unallocated:
> /dev/mapper/Cached-Nearline 3.79TiB
>
> 4.19.10-300.fc29.x86_64
> btrfs-progs v4.17.1
>
> I haven't nuked the other FS yet so I now have two that are either in
> the same or at least very similar states.
>
> What additional information can I provide?
>
> --Larkin
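
A side note on reading the quoted output: every checksum failure in it names the same tree block (271262429573120) with the same found/wanted values, so the log appears to describe one bad block hit repeatedly rather than many damaged blocks. Below is a rough Python sketch for tallying such lines from saved btrfs check output. The function name and the regular expression are illustrative assumptions of mine, not part of btrfs-progs, and the pattern only covers the message format seen in this thread.

```python
import re
from collections import Counter

# Assumed message format, taken from the output quoted in this thread:
#   checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
CSUM_RE = re.compile(
    r"checksum verify failed on (\d+) found ([0-9A-Fa-f]+) wanted ([0-9A-Fa-f]+)"
)


def tally_checksum_failures(log_text):
    """Count checksum failures per (bytenr, found, wanted) triple.

    Hypothetical helper: it searches anywhere in each line, so quoted
    lines that start with '> ' are matched as well.
    """
    return Counter(
        (int(m.group(1)), m.group(2), m.group(3))
        for m in CSUM_RE.finditer(log_text)
    )


if __name__ == "__main__":
    sample = (
        "> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84\n"
        "> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84\n"
        "> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,\n"
    )
    for (bytenr, found, wanted), count in tally_checksum_failures(sample).items():
        print(f"block {bytenr}: {count} failures (found {found}, wanted {wanted})")
```

If the tally shows a single key, the repeated messages are most likely retries against one block; several distinct keys would suggest wider damage.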