Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 04.04.2017 18:55, Chris Murphy wrote:
> On Tue, Apr 4, 2017 at 10:52 AM, Chris Murphy wrote:
>> Mounting -o ro,degraded is probably permitted by the file system, but
>> chunks of the file system and certainly your data, will be missing. So
>> it's just a matter of time before copying data off will fail.
>
> ** Context here is, more than 1 device missing.

Thank you guys for all your help and input.

I've ordered two new drives to back up all my data. I have a cloud backup in place, but 13TB takes a while to upload :-)

I think I'm gonna abandon btrfs as the main fs for my home server. I'm just gonna set up a separate LVM volume for storing snapshots and backups, since I use btrfs on all my single disk machines.

Thanks again everyone.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On Mon, Apr 3, 2017 at 10:02 PM, Robert Krig wrote:
> On 03.04.2017 16:25, Robert Krig wrote:
>> I'm gonna run an extensive memory check once I get home, since you
>> mentioned corrupt memory might be an issue here.
>
> I ran a memtest over a couple of hours with no errors. Ram seems to be
> fine so far.

Inconclusive. A memtest can take days to expose a problem, and even that's not conclusive. The list archive has some examples where memory testers gave RAM a pass, but doing things like compiling the kernel would fail.

> I've looked at the link you provided. Frankly it looks very scary. (At
> least to me it does)
> But I've just thought of something else.
>
> My storage array is BTRFS Raid1 with 4x8TB Drives.
> Wouldn't it be possible to simply disconnect two of those drives, mount
> with -o degraded and still have access (even if read-only) to all my data?

man mkfs.btrfs

Btrfs raid1 supports only one device missing, no matter how many drives. Mounting -o ro,degraded is probably permitted by the file system, but chunks of the file system, and certainly your data, will be missing. So it's just a matter of time before copying data off will fail.

I suggest trying -o ro with all drives, not a degraded mount, and copying data off. Any failures should be logged. Metadata errors are logged without paths, whereas data corruption errors include the path to the affected file. This is easier than scraping the file system with btrfs restore.

If you can't mount ro with all drives, or ro,degraded with just one device missing, you'll need to use btrfs restore, which is more tolerant of missing metadata.

--
Chris Murphy
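The "copy data off and note what fails" step suggested above could be sketched roughly as follows. This is only an illustration in Python, not a tool anyone in this thread used: the function name copy_tree_logging_failures is made up, and the demo runs on throwaway temporary directories rather than a real btrfs mount. It assumes the source file system is already mounted read-only.

```python
import os
import shutil
import tempfile


def copy_tree_logging_failures(src, dst):
    """Copy every file under src to dst; return a list of (path, error).

    Hypothetical helper: a read error on one file is recorded and the
    copy continues with the next file instead of aborting.
    Note: os.walk silently skips directories it cannot list.
    """
    failed = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            source_path = os.path.join(root, name)
            try:
                shutil.copy2(source_path, os.path.join(target_dir, name))
            except OSError as exc:
                failed.append((source_path, exc))
    return failed


# Demo on temporary directories (stand-ins for the ro mount and the backup).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    dst = os.path.join(tmp, "dst")
    os.makedirs(os.path.join(src, "photos"))
    with open(os.path.join(src, "photos", "a.txt"), "w") as fh:
        fh.write("hello")
    demo_failures = copy_tree_logging_failures(src, dst)
    demo_copied = os.path.exists(os.path.join(dst, "photos", "a.txt"))
    for path, exc in demo_failures:
        print("FAILED", path, exc)
```

The list of failed paths complements the kernel log: data corruption shows up there with a path, while metadata errors do not.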
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 2017-04-04 09:29, Brian B wrote:
> On 04/04/2017 12:02 AM, Robert Krig wrote:
>> My storage array is BTRFS Raid1 with 4x8TB Drives.
>> Wouldn't it be possible to simply disconnect two of those drives, mount
>> with -o degraded and still have access (even if read-only) to all my data?
>
> Just jumping on this point: my understanding of BTRFS "RAID1" is that
> each file (block?) is randomly assigned to two disks of the array (no
> matter how many disks are in the array). So if you remove two disks,
> you will probably have files that were "assigned" to both of those
> disks, and will be missing.
>
> In short, you can't remove more than one disk of a BTRFS RAID1 and still
> have all of your data.

That understanding is correct. From a functional perspective, BTRFS raid1 is currently a RAID10 implementation with striping happening at a very large granularity.
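To make the "two copies on some pair of disks" point concrete, here is a small Python sketch. It is a toy model and not the btrfs allocator: the device names and the layout (one chunk on every possible pair of four devices) are assumptions chosen purely for illustration.

```python
from itertools import combinations


def lost_chunks(chunk_pairs, removed):
    """Return indices of chunks whose two copies both sit on removed devices."""
    removed = set(removed)
    return [i for i, pair in enumerate(chunk_pairs) if set(pair) <= removed]


# Hypothetical layout: four devices, one chunk mirrored on every device pair.
devices = ["sda", "sdb", "sdc", "sdd"]
chunk_pairs = list(combinations(devices, 2))  # 6 chunks, 2 copies each

# Removing any single device never loses a chunk in this model.
single_losses = [lost_chunks(chunk_pairs, [d]) for d in devices]

# Removing any two devices loses the chunk mirrored on exactly that pair.
for pair in combinations(devices, 2):
    print(pair, "->", len(lost_chunks(chunk_pairs, pair)), "chunk(s) lost")
```

On a real array the chunk-to-device assignment is not this regular, but with enough chunks it becomes very likely that some chunk has both copies on whichever two devices are removed.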
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On Tue, Apr 04, 2017 at 09:29:11AM -0400, Brian B wrote:
> On 04/04/2017 12:02 AM, Robert Krig wrote:
> > My storage array is BTRFS Raid1 with 4x8TB Drives.
> > Wouldn't it be possible to simply disconnect two of those drives, mount
> > with -o degraded and still have access (even if read-only) to all my data?
>
> Just jumping on this point: my understanding of BTRFS "RAID1" is that
> each file (block?) is randomly assigned to two disks of the array (no

Arbitrarily assigned, rather than randomly assigned (there is a deterministic algorithm for it, but it's wise not to rely on the exact behaviour of that algorithm, because there are a number of factors that can alter its behaviour).

> matter how many disks are in the array). So if you remove two disks,
> you will probably have files that were "assigned" to both of those
> disks, and will be missing.
>
> In short, you can't remove more than one disk of a BTRFS RAID1 and still
> have all of your data.

Indeed.

Hugo.

--
Hugo Mills             | Some days, it's just not worth gnawing through the
hugo@... carfax.org.uk | straps
http://carfax.org.uk/  |
PGP: E2AB1DE4          |
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 04/04/2017 12:02 AM, Robert Krig wrote:
> My storage array is BTRFS Raid1 with 4x8TB Drives.
> Wouldn't it be possible to simply disconnect two of those drives, mount
> with -o degraded and still have access (even if read-only) to all my data?

Just jumping on this point: my understanding of BTRFS "RAID1" is that each file (block?) is randomly assigned to two disks of the array (no matter how many disks are in the array). So if you remove two disks, you will probably have files that were "assigned" to both of those disks, and will be missing.

In short, you can't remove more than one disk of a BTRFS RAID1 and still have all of your data.
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 03.04.2017 16:25, Robert Krig wrote:
> I'm gonna run an extensive memory check once I get home, since you
> mentioned corrupt memory might be an issue here.

I ran a memtest over a couple of hours with no errors. Ram seems to be fine so far.

I've looked at the link you provided. Frankly it looks very scary. (At least to me it does)
But I've just thought of something else.

My storage array is BTRFS Raid1 with 4x8TB Drives.
Wouldn't it be possible to simply disconnect two of those drives, mount with -o degraded and still have access (even if read-only) to all my data?

E.g. I could use the two removed drives as a backup and rebuild my array from there, since I'm kind of playing with the idea of turning it into an MD RAID5 and only using btrfs on specific lvm volumes which need it.

The one thing that slightly worries me with this idea is that I don't know if there is a way to tell which datablocks are on which drives. If I've understood btrfs raid1 correctly, it simply ensures that there is at least one copy of each block on a different device.

Would my idea work? Or could it be that I can only safely remove one drive, since the other drives might contain blocks from any of the other drives?
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 04/03/2017 04:20 PM, Robert Krig wrote:
> On 03.04.2017 16:08, Hans van Kranenburg wrote:
>> On 04/03/2017 12:11 PM, Robert Krig wrote:
>> The corruption is at item 157. Can you attach all of the output, or
>> pastebin it?
>
> I've attached the entire log of btrfs-debug-tree. This was generated
> with btrfs-progs 4.7.3

Meuh,

    item 156 key (23416298414080 EXTENT_ITEM 4096) itemoff 8643 itemsize 53
    item 157 key (23416298418176 EXTENT_ITEM 4096) itemoff 8590 itemsize 53

8590 + 53 = 8643. I don't get what's invalid about that.

"incorrect offsets 8590 1258314415"

    if (btrfs_item_offset_nr(buf, i) !=
        btrfs_item_end_nr(buf, i + 1)) {
            ret = BTRFS_TREE_BLOCK_INVALID_OFFSETS;
            fprintf(stderr, "incorrect offsets %u %u\n",
                    btrfs_item_offset_nr(buf, i),
                    btrfs_item_end_nr(buf, i + 1));
            goto fail;
    }

Ah, ok, so the corruption is in item 158, but it's reported as corruption in item 157.

There's no really simple tool right now to fix this manually. We can also try to dd 16kiB of metadata from disk, fix it, and write it back. We've done that before; it's a bit of work, but it can succeed.

Here are more instructions:
https://www.spinics.net/lists/linux-btrfs/msg62459.html

So, if you're the adventurous type...

But then again, if this is really memory failure, there might be other errors all around the fs which you haven't hit yet while reading back the data. Also note that btrfs does not protect you against this, nor against data in files that gets corrupted in memory before it's written out (writing out is where the checksum step happens).

> If it makes a difference, I can try it again with the newest version of
> btrfs-progs?

No, that code hasn't been touched in over 5 years.

--
Hans van Kranenburg
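The check quoted above can be mimicked in a few lines of Python to show why slot 157 gets the blame even though its own itemoff and itemsize look fine. This is only a sketch of the same comparison, not btrfs-progs code. The values for items 156 and 157 come from the debug-tree output; the split of item 158 into itemoff and itemsize is made up, chosen only so that its end equals the reported 1258314415.

```python
def first_bad_slot(items):
    """items: list of (itemoff, itemsize) tuples in slot order.

    Item data is packed from the end of the leaf towards the front, so
    the data of slot i must start exactly where the data of slot i + 1
    ends. Returns the first index i where that does not hold, or None.
    """
    for i in range(len(items) - 1):
        itemoff, _itemsize = items[i]
        next_off, next_size = items[i + 1]
        if itemoff != next_off + next_size:
            return i
    return None


FIRST_SLOT = 156
items = [
    (8643, 53),        # item 156, from the debug-tree output
    (8590, 53),        # item 157, from the debug-tree output
    (1258314362, 53),  # item 158: HYPOTHETICAL split, end = 1258314415
]

bad = first_bad_slot(items)
if bad is not None:
    print("mismatch reported at slot", FIRST_SLOT + bad)
```

The comparison at index i reads the fields of item i + 1, so garbage in item 158 surfaces as an error attributed to slot 157.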
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 03.04.2017 16:20, Robert Krig wrote:
> On 03.04.2017 16:08, Hans van Kranenburg wrote:
>> On 04/03/2017 12:11 PM, Robert Krig wrote:
>> The corruption is at item 157. Can you attach all of the output, or
>> pastebin it?
>
> I've attached the entire log of btrfs-debug-tree. This was generated
> with btrfs-progs 4.7.3
>
> If it makes a difference, I can try it again with the newest version of
> btrfs-progs?

I forgot to mention that btrfs-debug-tree also segfaults with a "memory access error".

I'm gonna run an extensive memory check once I get home, since you mentioned corrupt memory might be an issue here.
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 03.04.2017 16:08, Hans van Kranenburg wrote:
> On 04/03/2017 12:11 PM, Robert Krig wrote:
> The corruption is at item 157. Can you attach all of the output, or
> pastebin it?

I've attached the entire log of btrfs-debug-tree. This was generated with btrfs-progs 4.7.3

If it makes a difference, I can try it again with the newest version of btrfs-progs?

btrfs-progs v4.7.3
leaf 38666170826752 items 199 free space 1506 generation 1248226 owner 2
fs uuid 8c4f8e26-3442-463f-ad8a-668dfef02593
chunk uuid 1f04f64e-0ec8-4b39-83d9-a2df75179d3e
	item 0 key (23416295448576 EXTENT_ITEM 36864) itemoff 16230 itemsize 53
		extent refs 1 gen 671397 flags DATA
		extent data backref root 5 objectid 4959957 offset 0 count 1
	item 1 key (23416295485440 EXTENT_ITEM 8192) itemoff 16177 itemsize 53
		extent refs 1 gen 972749 flags DATA
		extent data backref root 5 objectid 7328099 offset 0 count 1
	item 2 key (23416295493632 EXTENT_ITEM 12288) itemoff 16124 itemsize 53
		extent refs 1 gen 797708 flags DATA
		extent data backref root 5 objectid 5842103 offset 1966080 count 1
	item 3 key (23416295505920 EXTENT_ITEM 8192) itemoff 16071 itemsize 53
		extent refs 1 gen 1244513 flags DATA
		extent data backref root 44107 objectid 28528 offset 974848 count 1
	item 4 key (23416295514112 EXTENT_ITEM 8192) itemoff 16034 itemsize 37
		extent refs 1 gen 625327 flags DATA
		shared data backref parent 38666872045568 count 1
	item 5 key (23416295522304 EXTENT_ITEM 16384) itemoff 15997 itemsize 37
		extent refs 1 gen 625327 flags DATA
		shared data backref parent 38666872045568 count 1
	item 6 key (23416295538688 EXTENT_ITEM 49152) itemoff 15944 itemsize 53
		extent refs 1 gen 585321 flags DATA
		extent data backref root 5 objectid 4742401 offset 393216 count 1
	item 7 key (23416295587840 EXTENT_ITEM 8192) itemoff 15907 itemsize 37
		extent refs 1 gen 625327 flags DATA
		shared data backref parent 38666872045568 count 1
	item 8 key (23416295596032 EXTENT_ITEM 4096) itemoff 15854 itemsize 53
		extent refs 1 gen 625327 flags DATA
		extent data backref root 5 objectid 1123021 offset 6029312 count 1
	item 9 key (23416295600128 EXTENT_ITEM 4096) itemoff 15801 itemsize 53
		extent refs 1 gen 975337 flags DATA
		extent data backref root 5 objectid 7334929 offset 0 count 1
	item 10 key (23416295604224 EXTENT_ITEM 57344) itemoff 15748 itemsize 53
		extent refs 1 gen 572974 flags DATA
		extent data backref root 5 objectid 4430156 offset 0 count 1
	item 11 key (23416295661568 EXTENT_ITEM 106496) itemoff 15695 itemsize 53
		extent refs 1 gen 585319 flags DATA
		extent data backref root 5 objectid 4742398 offset 2490368 count 1
	item 12 key (23416295768064 EXTENT_ITEM 4096) itemoff 15642 itemsize 53
		extent refs 1 gen 795227 flags DATA
		extent data backref root 5 objectid 5769382 offset 12288 count 1
	item 13 key (23416295772160 EXTENT_ITEM 4096) itemoff 15589 itemsize 53
		extent refs 1 gen 795227 flags DATA
		extent data backref root 5 objectid 5769383 offset 4096 count 1
	item 14 key (23416295776256 EXTENT_ITEM 4096) itemoff 15536 itemsize 53
		extent refs 1 gen 585370 flags DATA
		extent data backref root 5 objectid 4742594 offset 1310720 count 1
	item 15 key (23416295780352 EXTENT_ITEM 8192) itemoff 15499 itemsize 37
		extent refs 1 gen 625327 flags DATA
		shared data backref parent 32477101621248 count 1
	item 16 key (23416295788544 EXTENT_ITEM 151552) itemoff 15446 itemsize 53
		extent refs 1 gen 992062 flags DATA
		extent data backref root 5 objectid 7458028 offset 0 count 1
	item 17 key (23416295940096 EXTENT_ITEM 4096) itemoff 15393 itemsize 53
		extent refs 1 gen 1027477 flags DATA
		extent data backref root 5 objectid 7508879 offset 4096 count 1
	item 18 key (23416295944192 EXTENT_ITEM 4096) itemoff 15340 itemsize 53
		extent refs 1 gen 1023977 flags DATA
		extent data backref root 5 objectid 7496365 offset 20480 count 1
	item 19 key (23416295948288 EXTENT_ITEM 36864) itemoff 15287 itemsize 53
		extent refs 1 gen 516177 flags DATA
		extent data backref root 5 objectid 3897818 offset 12976128 count 1
	item 20 key (23416295985152 EXTENT_ITEM 45056) itemoff 15234 itemsize 53
		extent refs 1 gen 444976 flags DATA
		extent data backref root 5 objectid 3591929 offset 12320768 count 1
	item 21 key
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 04/03/2017 03:50 PM, Robert Krig wrote:
> On 03.04.2017 12:11, Robert Krig wrote:
>> Hi guys, I seem to have run into a spot of trouble with my btrfs partition.
>>
>> I've got 4 x 8TB in a RAID1 BTRFS configuration.
>>
>> I'm running Debian Jessie 64 Bit, 4.9.0-0.bpo.2-amd64 kernel. Btrfs
>> progs version v4.7.3
>>
>> Server has 8GB of Ram.
>>
>> I was running duperemove using a hashfile, which seemed to have run out
>> of space and aborted. Then I tried a balance operation, with -dusage
>> progressively set to 0 1 5 15 30 50, which then aborted, I presume that
>> this caused the fs to mount readonly. I only noticed it somewhat later.
>>
>> I've since rebooted, and I can mount the filesystem OK, but after some
>> time (I presume caused by reads or writes) it once again switches to
>> readonly.
>>
>> I tried unmounting/remounting again and running a scrub, but the scrub
>> aborts after some time.
>
> I've compiled the newest btrfs-tools version 4.10.2
>
> This is what I get when running a btrfsck -p /dev/sda
>
> Checking filesystem on /dev/sda
> UUID: 8c4f8e26-3442-463f-ad8a-668dfef02593
> incorrect offsets 8590 1258314415
> bad block 38666170826752
>
> ERROR: errors found in extent allocation tree or chunk allocation
> Speicherzugriffsfehler
>
> For the non-german speakers: Speicherzugriffsfehler = Memory Access Error
>
> Dmesg shows this:
>
> Apr 03 15:47:05 atlas kernel: btrfs[9140]: segfault at 9476b99e ip
> 0044c459 sp 7fff556b4b10 error 4 in btrfs[40+9d000]

That's probably because the tool does not verify that the numbers in the fields make sense before using them.

--
Hans van Kranenburg
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 04/03/2017 12:11 PM, Robert Krig wrote:
> Hi guys, I seem to have run into a spot of trouble with my btrfs partition.
>
> I've got 4 x 8TB in a RAID1 BTRFS configuration.
>
> I'm running Debian Jessie 64 Bit, 4.9.0-0.bpo.2-amd64 kernel. Btrfs
> progs version v4.7.3
>
> Server has 8GB of Ram.
>
> I was running duperemove using a hashfile, which seemed to have run out
> of space and aborted. Then I tried a balance operation, with -dusage
> progressively set to 0 1 5 15 30 50, which then aborted, I presume that
> this caused the fs to mount readonly. I only noticed it somewhat later.

The balance probably did not cause the issue, but it ran across the invalid metadata page while digging around in the filesystem, and then choked on it.

> I've since rebooted, and I can mount the filesystem OK, but after some
> time (I presume caused by reads or writes) it once again switches to
> readonly.
>
> I tried unmounting/remounting again and running a scrub, but the scrub
> aborts after some time.
>
> Here is the output from the kernel when the partition crashes:
>
> Apr 03 11:32:57 atlas kernel: BTRFS info (device sda): The free space
> cache file (37732863967232) is invalid. skip it
> Apr 03 11:33:46 atlas kernel: BTRFS critical (device sda): corrupt leaf,
> slot offset bad: block=38666170826752, root=1, slot=157
> [...]

Note: The root=1 is a lie? Looking at the output of btrfs-debug-tree below, this is definitely a tree block of tree 2, not 1. I have seen this more often, but not looked at the code yet. Maybe some bug in assembling the error message?

> I tried running a btrfs-debug-tree -b 38666170826752 /dev/sda
>
> btrfs-progs v4.7.3
> leaf 38666170826752 items 199 free space 1506 generation 1248226 owner 2
> fs uuid 8c4f8e26-3442-463f-ad8a-668dfef02593
> chunk uuid 1f04f64e-0ec8-4b39-83d9-a2df75179d3e
>     item 0 key (23416295448576 EXTENT_ITEM 36864) itemoff 16230 itemsize 53
>         extent refs 1 gen 671397 flags DATA
>         extent data backref root 5 objectid 4959957 offset 0 count 1
> [...]

The corruption is at item 157. Can you attach all of the output, or pastebin it?

> this goes on and on. I can provide the entire output if thats helpful.

Yes. The corruption is in item 157, and it starts at the itemoff value. This is the offset of the item data in the metadata page.

See https://btrfs.wiki.kernel.org/index.php/On-disk_Format#Leaf_Node

> Any ideas on what I could do to fix the partition? Is it fixable, or is
> it a lost cause?

Memory corruption, not on-disk corruption. So, either a bitflip, or garbage which ended up in this memory location for whatever reason, or a bug in whatever part of the kernel, a pointer in another module gone wonky, etc. We might learn more about that after seeing more of the output.

--
Hans van Kranenburg
Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
On 03.04.2017 12:11, Robert Krig wrote:
> Hi guys, I seem to have run into a spot of trouble with my btrfs partition.
>
> I've got 4 x 8TB in a RAID1 BTRFS configuration.
>
> I'm running Debian Jessie 64 Bit, 4.9.0-0.bpo.2-amd64 kernel. Btrfs
> progs version v4.7.3
>
> Server has 8GB of Ram.
>
> I was running duperemove using a hashfile, which seemed to have run out
> of space and aborted. Then I tried a balance operation, with -dusage
> progressively set to 0 1 5 15 30 50, which then aborted, I presume that
> this caused the fs to mount readonly. I only noticed it somewhat later.
>
> I've since rebooted, and I can mount the filesystem OK, but after some
> time (I presume caused by reads or writes) it once again switches to
> readonly.
>
> I tried unmounting/remounting again and running a scrub, but the scrub
> aborts after some time.

I've compiled the newest btrfs-tools version 4.10.2

This is what I get when running a btrfsck -p /dev/sda

Checking filesystem on /dev/sda
UUID: 8c4f8e26-3442-463f-ad8a-668dfef02593
incorrect offsets 8590 1258314415
bad block 38666170826752

ERROR: errors found in extent allocation tree or chunk allocation
Speicherzugriffsfehler

For the non-german speakers: Speicherzugriffsfehler = Memory Access Error

Dmesg shows this:

Apr 03 15:47:05 atlas kernel: btrfs[9140]: segfault at 9476b99e ip 0044c459 sp 7fff556b4b10 error 4 in btrfs[40+9d000]
Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"
Hi guys, I seem to have run into a spot of trouble with my btrfs partition.

I've got 4 x 8TB in a RAID1 BTRFS configuration.

I'm running Debian Jessie 64 Bit, 4.9.0-0.bpo.2-amd64 kernel. Btrfs progs version v4.7.3

Server has 8GB of Ram.

I was running duperemove using a hashfile, which seemed to have run out of space and aborted. Then I tried a balance operation, with -dusage progressively set to 0 1 5 15 30 50, which then aborted. I presume that this caused the fs to mount readonly. I only noticed it somewhat later.

I've since rebooted, and I can mount the filesystem OK, but after some time (I presume caused by reads or writes) it once again switches to readonly.

I tried unmounting/remounting again and running a scrub, but the scrub aborts after some time.

Here is the output from the kernel when the partition crashes:

Apr 03 11:32:57 atlas kernel: BTRFS info (device sda): The free space cache file (37732863967232) is invalid. skip it
Apr 03 11:33:46 atlas kernel: BTRFS critical (device sda): corrupt leaf, slot offset bad: block=38666170826752, root=1, slot=157
Apr 03 11:33:46 atlas kernel: [ cut here ]
Apr 03 11:33:46 atlas kernel: WARNING: CPU: 0 PID: 17810 at /home/zumbi/linux-4.9.13/fs/btrfs/extent-tree.c:6961 __btrfs_free_extent.isra.69+0x152/0xd60 [b
Apr 03 11:33:46 atlas kernel: BTRFS: Transaction aborted (error -5)
Apr 03 11:33:46 atlas kernel: Modules linked in: xt_multiport iptable_filter ip_tables x_tables binfmt_misc cpufreq_userspace cpufreq_conservative cpufreq_
Apr 03 11:33:46 atlas kernel: ppdev lp parport autofs4 btrfs xor raid6_pq dm_mod md_mod fuse sg sd_mod ahci libahci libata crc32c_intel scsi_mod fan therm
Apr 03 11:33:46 atlas kernel: CPU: 0 PID: 17810 Comm: mc Not tainted 4.9.0-0.bpo.2-amd64 #1 Debian 4.9.13-1~bpo8+1
Apr 03 11:33:46 atlas kernel: Hardware name: ASUS All Series/H87M-E, BIOS 0703 10/30/2013
Apr 03 11:33:46 atlas kernel: 97d29cd5 b8ab4bb53a50
Apr 03 11:33:46 atlas kernel: 97a778a4 154c080b2000 b8ab4bb53aa8 8908ad438b40
Apr 03 11:33:46 atlas kernel: 890951b96000 89086c3d4000 97a7791f
Apr 03 11:33:46 atlas kernel: Call Trace:
Apr 03 11:33:46 atlas kernel: [] ? dump_stack+0x5c/0x77
Apr 03 11:33:46 atlas kernel: [] ? __warn+0xc4/0xe0
Apr 03 11:33:46 atlas kernel: [] ? warn_slowpath_fmt+0x5f/0x80
Apr 03 11:33:46 atlas kernel: [] ? __btrfs_free_extent.isra.69+0x152/0xd60 [btrfs]
Apr 03 11:33:46 atlas kernel: [] ? __btrfs_run_delayed_refs+0x466/0x1360 [btrfs]
Apr 03 11:33:46 atlas kernel: [] ? set_extent_buffer_dirty+0x64/0xb0 [btrfs]
Apr 03 11:33:46 atlas kernel: [] ? btrfs_run_delayed_refs+0x8f/0x2b0 [btrfs]
Apr 03 11:33:46 atlas kernel: [] ? btrfs_should_end_transaction+0x3f/0x60 [btrfs]
Apr 03 11:33:46 atlas kernel: [] ? btrfs_truncate_inode_items+0x63a/0xde0 [btrfs]
Apr 03 11:33:46 atlas kernel: [] ? btrfs_evict_inode+0x4a2/0x5f0 [btrfs]
Apr 03 11:33:46 atlas kernel: [] ? evict+0xb6/0x180
Apr 03 11:33:46 atlas kernel: [] ? do_unlinkat+0x148/0x300
Apr 03 11:33:46 atlas kernel: [] ? system_call_fast_compare_end+0xc/0x9b
Apr 03 11:33:46 atlas kernel: ---[ end trace 2a45c2819ff7b785 ]---
Apr 03 11:33:46 atlas kernel: BTRFS: error (device sda) in __btrfs_free_extent:6961: errno=-5 IO failure
Apr 03 11:33:46 atlas kernel: BTRFS info (device sda): forced readonly
Apr 03 11:33:46 atlas kernel: BTRFS: error (device sda) in btrfs_run_delayed_refs:2967: errno=-5 IO failure
Apr 03 11:33:50 atlas kernel: BTRFS warning (device sda): failed setting block group ro, ret=-30
Apr 03 11:33:50 atlas kernel: BTRFS warning (device sda): failed setting block group ro, ret=-30
Apr 03 11:33:52 atlas kernel: BTRFS warning (device sda): failed setting block group ro, ret=-30
Apr 03 11:33:53 atlas kernel: BTRFS warning (device sda): Skipping commit of aborted transaction.
Apr 03 11:33:53 atlas kernel: BTRFS: error (device sda) in cleanup_transaction:1850: errno=-5 IO failure
Apr 03 11:33:53 atlas kernel: BTRFS info (device sda): delayed_refs has NO entry
Apr 03 11:33:54 atlas kernel: BTRFS warning (device sda): failed setting block group ro, ret=-30

I tried running a btrfs-debug-tree -b 38666170826752 /dev/sda

btrfs-progs v4.7.3
leaf 38666170826752 items 199 free space 1506 generation 1248226 owner 2
fs uuid 8c4f8e26-3442-463f-ad8a-668dfef02593
chunk uuid 1f04f64e-0ec8-4b39-83d9-a2df75179d3e
	item 0 key (23416295448576 EXTENT_ITEM 36864) itemoff 16230 itemsize 53
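For anyone scripting around this, the block number, root and slot can be pulled out of the "corrupt leaf" line and turned into the btrfs-debug-tree invocation used in this message. The Python below is only a sketch: the regular expression is written against the single log line shown here and is an assumption about its format, which may differ on other kernel versions.

```python
import re

# The log line from this report, verbatim.
LINE = (
    "Apr 03 11:33:46 atlas kernel: BTRFS critical (device sda): corrupt leaf, "
    "slot offset bad: block=38666170826752, root=1, slot=157"
)

# Pattern derived from the line above only (assumed format).
PATTERN = re.compile(
    r"BTRFS critical \(device (?P<device>\w+)\): corrupt leaf, "
    r"slot offset bad: block=(?P<block>\d+), root=(?P<root>\d+), "
    r"slot=(?P<slot>\d+)"
)


def parse_corrupt_leaf(line):
    """Return device, block, root and slot from a matching line, else None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "block": int(match.group("block")),
        "root": int(match.group("root")),
        "slot": int(match.group("slot")),
    }


info = parse_corrupt_leaf(LINE)
if info is not None:
    # Same command as used in this message, with the parsed values filled in.
    print(f"btrfs-debug-tree -b {info['block']} /dev/{info['device']}")
```

The slot number says where in the dumped leaf to start looking.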
Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only
On 07.03.2017 at 15:12, Hans van Kranenburg wrote:
> On 03/05/2017 11:50 PM, Lukas Tribus wrote:
>> I upgraded btrfs-tools to 4.8.1 as 4.4 didn't have btrfs
>> inspect-internal dump-tree.
>> But I cannot find anything about 5242107641856 in the dump-tree output.
>>
>> What does that mean?
>
> I have no idea. It probably means it's gone.
>
> Did you use the filesystem read/write? Are the symptoms also gone?

Well, I read basically everything and copied it to other drives. Nothing appears corrupted from what I can tell. I didn't write to the pool consciously, although I did not mount it readonly either, now that I'm thinking about it ...

btrfs check --readonly reports block corruption (and a number of "no inode ref" in files/folders):

Checking filesystem on /dev/mapper/sda3_crypt
UUID: f50f980e-7640-49c7-bf8d-20d55cfe6005
The following tree block(s) is corrupted in tree 261:
	tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0)
The following tree block(s) is corrupted in tree 263:
	tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0)
The following tree block(s) is corrupted in tree 6685:
	tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0)
The following tree block(s) is corrupted in tree 6879:
	tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0)
The following tree block(s) is corrupted in tree 6893:
	tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0)
The following tree block(s) is corrupted in tree 6896:
	tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0)
found 4080263675904 bytes used err is 1
total csum bytes: 0
total tree bytes: 181780480
total fs tree bytes: 0
total extent tree bytes: 178765824
btree space waste bytes: 49102341
file data blocks allocated: 1545338880
 referenced 1545338880

Not sure how btrfs check finds a corrupted block that doesn't appear in the dump-tree output.
And I had an additional stack trace on the new btrfs pool I was copying the data to:

[873067.780479] BTRFS error (device sdf3): bdev /dev/sdf3 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
[873067.790639] BTRFS error (device sdf3): bdev /dev/sdf3 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
[873067.800708] [ cut here ]
[873067.800727] WARNING: CPU: 3 PID: 12942 at /build/linux-hwe-6_oOe5/linux-hwe-4.8.0/fs/btrfs/extent-tree.c:6954 __btrfs_free_extent.isra.71+0x2cb/0xcc0 [btrfs]
[873067.800730] BTRFS: Transaction aborted (error -5)
[873067.800731] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs algif_skcipher af_alg xen_gntdev xen_evtchn xenfs xen_privcmd dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp nls_iso8859_1 coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel bridge stp llc intel_rapl_perf serio_raw lpc_ich joydev shpchp nuvoton_cir input_leds mei_me mei rc_core mac_hid ie31200_edac edac_core ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 uas usb_storage btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid mxm_wmi aesni_intel aes_x86_64 i915 glue_helper lrw i2c_algo_bit ablk_helper tg3 cryptd drm_kms_helper syscopyarea sysfillrect
[873067.800782] firewire_ohci ptp sysimgblt psmouse firewire_core fb_sys_fops crc_itu_t pps_core ahci drm libahci wmi fjes video
[873067.800791] CPU: 3 PID: 12942 Comm: screen Tainted: G W 4.8.0-39-generic #42~16.04.1-Ubuntu
[873067.800791] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Extreme6, BIOS P2.80 07/01/2013
[873067.800793] 0200 f56bf709 880259f1f908 8142e043
[873067.800795] 880259f1f958 880259f1f948 8108313b
[873067.800797] 1b2a59f1faa0 fffb 01cda76bc000 8802a9fe0d20
[873067.800798] Call Trace:
[873067.800803] [] dump_stack+0x63/0x90
[873067.800805] [] __warn+0xcb/0xf0
[873067.800807] [] warn_slowpath_fmt+0x5f/0x80
[873067.800821] [] __btrfs_free_extent.isra.71+0x2cb/0xcc0 [btrfs]
[873067.800836] [] ? btrfs_merge_delayed_refs+0x8f/0x6a0 [btrfs]
[873067.800846] [] __btrfs_run_delayed_refs+0xb10/0x12c0 [btrfs]
[873067.800857] [] ? set_page_dirty+0x58/0xb0
[873067.800869] [] ? set_extent_buffer_dirty+0x78/0xd0 [btrfs]
[873067.800879] [] btrfs_run_delayed_refs+0x8e/0x2b0 [btrfs]
[873067.800890] [] commit_cowonly_roots+0xae/0x300 [btrfs]
[873067.800901] [] ? btrfs_qgroup_account_extents+0x84/0x180 [btrfs]
[873067.800911] [] btrfs_commit_transaction+0x573/0xb00 [btrfs]
[873067.800920] [] ? start_transaction+0x9e/0x4c0 [btrfs]
[873067.800930] [] btrfs_commit_super+0x8f/0xa0 [btrfs]
[873067.800939] [] close_ctree+0x2b7/0x360 [btrfs]
[873067.800947] [] btrfs_put_super+0x19/0x20 [btrfs]
[873067.800949] []
Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only
On 03/05/2017 11:50 PM, Lukas Tribus wrote:
> On 24.02.2017 at 01:26, Hans van Kranenburg wrote:
>>> Once that is done, I would like to go over the "btrfs recovery" thread
>>> and see if it can be applied for my case as well. I will certainly
>>> need your help when that time comes...
>> We can take a stab at it.
>
> I upgraded btrfs-tools to 4.8.1 as 4.4 didn't have btrfs
> inspect-internal dump-tree.
> But I cannot find anything about 5242107641856 in the dump-tree output.
>
> What does that mean?

I have no idea. It probably means it's gone.

Did you use the filesystem read/write? Are the symptoms also gone?

--
Hans van Kranenburg
Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only
Hello Hans,

On 24.02.2017 at 01:26, Hans van Kranenburg wrote:
>> Once that is done, I would like to go over the "btrfs recovery" thread
>> and see if it can be applied for my case as well. I will certainly need
>> your help when that time comes...
> We can take a stab at it.

I upgraded btrfs-tools to 4.8.1 as 4.4 didn't have btrfs inspect-internal dump-tree. But I cannot find anything about 5242107641856 in the dump-tree output.

What does that mean?

Thanks,
Lukas
Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only
On 02/24/2017 12:47 AM, Lukas Tribus wrote:
> Hello Hans,
>
> On 22.02.2017 at 20:40, Hans van Kranenburg wrote:
>> Question here is... is it easier for you to nuke the filesystem and
>> restore the files from somewhere else, or do you want to figure out
>> manually if it's recoverable, and spend some time with dd, hexedit,
>> reading struct definitions in btrfs kernel C code etc...
>>
>> If the regular --repair can't fix it (and it can't do magic if you shoot
>> a hole in it with a shotgun), then there's no automated other tool that
>> can do it now.
>>
>> Since it's block 5242107641856 all the time, it might be worthwhile to
>> have a look at it. Either it's that block, or there's a bigger mess
>> hidden behind it.
>
> Thanks for all the input here and on IRC. I now have a good
> understanding of what can and what cannot be done realistically.
>
> The files are still fully readable and I'm going to back up as much data
> as I can over the next few days.
>
> Once that is done, I would like to go over the "btrfs recovery" thread
> and see if it can be applied for my case as well. I will certainly need
> your help when that time comes...

We can take a stab at it.

--
Hans van Kranenburg
Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only
Hello Hans,

On 22.02.2017 at 20:40, Hans van Kranenburg wrote:
> Question here is... is it easier for you to nuke the filesystem and
> restore the files from somewhere else, or do you want to figure out
> manually if it's recoverable, and spend some time with dd, hexedit,
> reading struct definitions in btrfs kernel C code etc...
>
> If the regular --repair can't fix it (and it can't do magic if you shoot
> a hole in it with a shotgun), then there's no automated other tool that
> can do it now.
>
> Since it's block 5242107641856 all the time, it might be worthwhile to
> have a look at it. Either it's that block, or there's a bigger mess
> hidden behind it.

Thanks for all the input here and on IRC. I now have a good understanding of what can and what cannot be done realistically.

The files are still fully readable and I'm going to back up as much data as I can over the next few days.

Once that is done, I would like to go over the "btrfs recovery" thread and see if it can be applied for my case as well. I will certainly need your help when that time comes...

Thanks for all your help,
Lukas
Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only
On 02/22/2017 08:44 AM, Lukas Tribus wrote: > Upgrading to 4.8, the FS no longer causes a kernel calltrace and does > not go read-only. It only shows the "corrupt leaf, slot offset bad" > message. > > A scrub completed without errors on 3 devices, while it was aborted on 2 > devices. Not sure why it was aborted, since there is no error message in > dmesg? > > Any suggestions why the scrub was aborted? Maybe because of the "corrupt leaf" error. > # uname -a > Linux srv1-dom0 4.8.0-36-generic #36~16.04.1-Ubuntu SMP Sun Feb 5 > 09:39:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux > # btrfs scrub status /storage/users/ > scrub status for f50f980e-7640-49c7-bf8d-20d55cfe6005 > scrub started at Wed Feb 22 00:07:33 2017 and was aborted after > 06:35:42 > total bytes scrubbed: 10.60TiB with 0 errors > /# btrfs scrub status /storage/users/ -d > scrub status for f50f980e-7640-49c7-bf8d-20d55cfe6005 > scrub device /dev/dm-5 (id 1) history > scrub started at Wed Feb 22 00:07:33 2017 and finished after > 06:35:36 > total bytes scrubbed: 2.30TiB with 0 errors > scrub device /dev/dm-6 (id 2) history > scrub started at Wed Feb 22 00:07:33 2017 and finished after > 06:35:30 > total bytes scrubbed: 2.30TiB with 0 errors > scrub device /dev/dm-7 (id 3) history > scrub started at Wed Feb 22 00:07:33 2017 and finished after > 06:35:42 > total bytes scrubbed: 2.30TiB with 0 errors > scrub device /dev/dm-8 (id 4) history > scrub started at Wed Feb 22 00:07:33 2017 and was aborted after > 05:01:37 > total bytes scrubbed: 1.85TiB with 0 errors > scrub device /dev/mapper/sde3_crypt (id 5) history > scrub started at Wed Feb 22 00:07:33 2017 and was aborted after > 05:01:37 > total bytes scrubbed: 1.85TiB with 0 errors > #dmesg | grep BTRFS > [ 929.737119] BTRFS critical (device dm-9): corrupt leaf, slot offset > bad: block=5242107641856,root=1, slot=39 > [19772.594129] BTRFS critical (device dm-9): corrupt leaf, slot offset > bad: block=5242107641856,root=1, slot=39 > [19777.127704] BTRFS 
critical (device dm-9): corrupt leaf, slot offset bad: block=5242107641856,root=1, slot=39
> [19777.552191] BTRFS critical (device dm-9): corrupt leaf, slot offset bad: block=5242107641856,root=1, slot=39

Ok, this is not a csum failure, so it's probably not the disk giving back other data than what was sent to it when doing the writes, nor a disk controller that corrupted the data while writing.

And it's a metadata page in which some of the entries no longer make sense to btrfs. Specifically, it's in root 1, which is the tree that contains information about all other subtrees containing metadata, so it's quite an important one.

So the corruption that is now present in there likely happened in memory before it was written out. This is also a scenario in which DUP or RAIDx on disk doesn't help you, because in memory it's stored just once.

If this is a bitflip-like thing in memory, it would probably be possible to spot it and manually correct it (using a patched btrfsck with the bitflip patch, or manually by hex editing).

Another option is memory corruption or a bug somewhere else in the kernel, which led to the memory address of a pointer being changed, so that a write to memory ended up in the middle of some btrfs metadata waiting to be checksummed and written to disk.

Question here is... is it easier for you to nuke the filesystem and restore the files from somewhere else, or do you want to figure out manually if it's recoverable, and spend some time with dd, hexedit, reading struct definitions in btrfs kernel C code etc...

If the regular --repair can't fix it (and it can't do magic if you shoot a hole in it with a shotgun), then there's no automated other tool that can do it now.

Since it's block 5242107641856 all the time, it might be worthwhile to have a look at it. Either it's that block, or there's a bigger mess hidden behind it.
--
Hans van Kranenburg
Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only
I did a "btrfs check" (--readonly): Summary: 589x filetype 1 errors 4, no inode ref (--> Files) 597x filetype 2 errors 4, no inode ref (--> Directories) 1183x root xxx inode YY errors 2001, no inode item, link count wrong I looked at a handful of reported files which are verifiable via public MD5/SHA1 checksums and they are not corrupted, the checksum is correct. Any hints or suggestions would be much appreciated, please see below for the btrfs check output (repeating lines omitted and some filenames redacted): Checking filesystem on /dev/dm-9 UUID: f50f980e-7640-49c7-bf8d-20d55cfe6005 checking extents [.] [...] incorrect offsets 14927 14415 bad block 5242107641856 Errors found in extent allocation tree or chunk allocation checking free space cache [.] [...] checking fs roots [.] [...] incorrect offsets 14927 14415 incorrect offsets 14927 14415 root 261 inode 127094 errors 500, file extent discount, nbytes wrong Found file extent holes: start: 0, len: 499712 unresolved ref dir 127093 index 2 namelen 24 name ABC DE Fghij Klmnopr.tuv filetype 1 errors 4, no inode ref root 261 inode 127095 errors 2001, no inode item, link count wrong unresolved ref dir 127080 index 13 namelen 17 name Whateverdir123456 filetype 2 errors 4, no inode ref root 261 inode 127097 errors 2001, no inode item, link count wrong unresolved ref dir 127080 index 14 namelen 12 name WhateverDirectory2 filetype 2 errors 4, no inode ref root 261 inode 127099 errors 2001, no inode item, link count wrong unresolved ref dir 127080 index 15 namelen 11 name AnyDir filetype 2 errors 4, no inode ref root 261 inode 127105 errors 2001, no inode item, link count wrong unresolved ref dir 127080 index 16 namelen 10 name AnotherDir filetype 2 errors 4, no inode ref root 261 inode 127107 errors 2001, no inode item, link count wrong unresolved ref dir 127080 index 17 namelen 11 name Folder11 filetype 2 errors 4, no inode ref root 261 inode 127112 errors 2001, no inode item, link count wrong unresolved ref dir 126959 
index 51 namelen 11 name Folder120 filetype 2 errors 4, no inode ref root 261 inode 127114 errors 2001, no inode item, link count wrong unresolved ref dir 126146 index 40 namelen 13 name GVC-dir filetype 2 errors 4, no inode ref root 261 inode 127396 errors 2001, no inode item, link count wrong unresolved ref dir 126146 index 41 namelen 4 name G3-dir filetype 2 errors 4, no inode ref root 261 inode 127527 errors 2001, no inode item, link count wrong unresolved ref dir 126146 index 42 namelen 11 name Hello Dir 2 filetype 2 errors 4, no inode ref root 261 inode 127535 errors 2001, no inode item, link count wrong unresolved ref dir 126146 index 43 namelen 4 name Hellodir filetype 2 errors 4, no inode ref root 261 inode 127573 errors 2001, no inode item, link count wrong unresolved ref dir 126146 index 44 namelen 6 name Hello 2 filetype 2 errors 4, no inode ref root 261 inode 127620 errors 2001, no inode item, link count wrong [...] root 261 inode 177273 errors 2001, no inode item, link count wrong unresolved ref dir 23439 index 23 namelen 24 name Firefox Setup 51.0.1.exe filetype 1 errors 4, no inode ref root 261 inode 177275 errors 2001, no inode item, link count wrong unresolved ref dir 23439 index 26 namelen 27 name Firefox Setup 45.7.0esr.exe filetype 1 errors 4, no inode ref root 261 inode 180457 errors 2001, no inode item, link count wrong [...] checking fs roots [o] incorrect offsets 14927 14415 checking fs roots [.] [...] checking fs roots [o] The following tree block(s) is corrupted in tree 263: tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0) checking fs roots [o] incorrect offsets 14927 14415 checking fs roots [O] The following tree block(s) is corrupted in tree 6685: tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0) checking fs roots [o] checking fs roots [.] 
incorrect offsets 14927 14415 The following tree block(s) is corrupted in tree 6879: tree block bytenr: 5242107641856, level: 0, node key: (5241902333952, 169, 0) checking fs roots [o] incorrect offsets 14927 14415 incorrect offsets 14927 14415 root 6893 inode 127094 errors 500, file extent discount, nbytes wrong Found file extent holes: start: 0, len: 499712 unresolved ref dir 127093 index 2 namelen 24 name ABC DE Fghij Klmnopr.tuv filetype 1 errors 4, no inode ref root 6893 inode 127095 errors 2001, no inode item, link count wrong unresolved ref dir 127080 index 13 namelen 17 name Whateverdir123456 filetype 2 errors 4, no inode ref root 6893 inode 127097 errors 2001, no inode item, link count wrong [...] root 6893 inode 177273 errors 2001, no inode item, link count wrong unresolved ref dir 23439 index 23 namelen 24 name Firefox Setup 51.0.1.exe filetype 1 errors 4, no inode ref root 6893 inode 177275 errors 2001, no inode item, link count wrong unresolved ref dir 23439 index 26
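The per-category counts in the summary at the top of this message can be reproduced from a saved `btrfs check --readonly` log. The following is only a minimal sketch: the function name `tally_check_errors`, the category labels and the sample string are illustrative assumptions, and the patterns are copied solely from the lines quoted in this thread.

```python
import re
from collections import Counter

# Patterns copied from the `btrfs check` lines quoted in this thread.
# The category labels are illustrative, not btrfs-progs terminology.
PATTERNS = {
    "file without inode ref": r"filetype 1 errors 4, no inode ref",
    "directory without inode ref": r"filetype 2 errors 4, no inode ref",
    "missing inode item": r"errors 2001, no inode item, link count wrong",
}


def tally_check_errors(log_text):
    """Count how often each known error pattern occurs in the log text."""
    counts = Counter()
    for label, pattern in PATTERNS.items():
        counts[label] = len(re.findall(pattern, log_text))
    return counts


# Hypothetical excerpt assembled from lines shown above.
sample = (
    "root 261 inode 127095 errors 2001, no inode item, link count wrong "
    "unresolved ref dir 127080 index 13 namelen 17 name Whateverdir123456 "
    "filetype 2 errors 4, no inode ref "
    "root 261 inode 177273 errors 2001, no inode item, link count wrong "
    "unresolved ref dir 23439 index 23 namelen 24 name Firefox Setup 51.0.1.exe "
    "filetype 1 errors 4, no inode ref"
)

for label, count in tally_check_errors(sample).items():
    print(f"{count}x {label}")
```

Matching on the literal message text keeps the sketch independent of how the log is wrapped, at the cost of missing any error line whose wording differs.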
Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only
Upgrading to 4.8, the FS no longer causes a kernel calltrace and does not go read-only. It only shows the "corrupt leaf, slot offset bad" message. A scrub completed without errors on 3 devices, while it was aborted on 2 devices. Not sure why it was aborted, since there is no error message in dmesg? Any suggestions why the scrub was aborted? # uname -a Linux srv1-dom0 4.8.0-36-generic #36~16.04.1-Ubuntu SMP Sun Feb 5 09:39:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux # btrfs scrub status /storage/users/ scrub status for f50f980e-7640-49c7-bf8d-20d55cfe6005 scrub started at Wed Feb 22 00:07:33 2017 and was aborted after 06:35:42 total bytes scrubbed: 10.60TiB with 0 errors /# btrfs scrub status /storage/users/ -d scrub status for f50f980e-7640-49c7-bf8d-20d55cfe6005 scrub device /dev/dm-5 (id 1) history scrub started at Wed Feb 22 00:07:33 2017 and finished after 06:35:36 total bytes scrubbed: 2.30TiB with 0 errors scrub device /dev/dm-6 (id 2) history scrub started at Wed Feb 22 00:07:33 2017 and finished after 06:35:30 total bytes scrubbed: 2.30TiB with 0 errors scrub device /dev/dm-7 (id 3) history scrub started at Wed Feb 22 00:07:33 2017 and finished after 06:35:42 total bytes scrubbed: 2.30TiB with 0 errors scrub device /dev/dm-8 (id 4) history scrub started at Wed Feb 22 00:07:33 2017 and was aborted after 05:01:37 total bytes scrubbed: 1.85TiB with 0 errors scrub device /dev/mapper/sde3_crypt (id 5) history scrub started at Wed Feb 22 00:07:33 2017 and was aborted after 05:01:37 total bytes scrubbed: 1.85TiB with 0 errors #dmesg | grep BTRFS [ 929.737119] BTRFS critical (device dm-9): corrupt leaf, slot offset bad: block=5242107641856,root=1, slot=39 [19772.594129] BTRFS critical (device dm-9): corrupt leaf, slot offset bad: block=5242107641856,root=1, slot=39 [19777.127704] BTRFS critical (device dm-9): corrupt leaf, slot offset bad: block=5242107641856,root=1, slot=39 [19777.552191] BTRFS critical (device dm-9): corrupt leaf, slot offset bad: 
block=5242107641856,root=1, slot=39 #
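To see at a glance which devices stopped early, the per-device output of `btrfs scrub status -d` can be filtered for aborted entries. This is a rough sketch that assumes the text format shown in this message (unquoted, without `>` markers); the regular expression, the function name `aborted_devices` and the sample string are illustrative and not part of btrfs-progs.

```python
import re

# Matches one per-device entry of `btrfs scrub status -d` as quoted above.
DEVICE_RE = re.compile(
    r"scrub device (?P<path>\S+) \(id (?P<devid>\d+)\) history\s+"
    r"scrub started at .*? and (?P<state>finished|was aborted) after\s+"
    r"(?P<duration>\d+:\d+:\d+)",
    re.DOTALL,
)


def aborted_devices(status_text):
    """Return (path, devid, duration) for every device whose scrub was aborted."""
    return [
        (m["path"], int(m["devid"]), m["duration"])
        for m in DEVICE_RE.finditer(status_text)
        if m["state"] == "was aborted"
    ]


# Excerpt of the status output posted in this message.
sample = """
scrub device /dev/dm-7 (id 3) history
    scrub started at Wed Feb 22 00:07:33 2017 and finished after 06:35:42
    total bytes scrubbed: 2.30TiB with 0 errors
scrub device /dev/dm-8 (id 4) history
    scrub started at Wed Feb 22 00:07:33 2017 and was aborted after 05:01:37
    total bytes scrubbed: 1.85TiB with 0 errors
scrub device /dev/mapper/sde3_crypt (id 5) history
    scrub started at Wed Feb 22 00:07:33 2017 and was aborted after 05:01:37
    total bytes scrubbed: 1.85TiB with 0 errors
"""

for path, devid, duration in aborted_devices(sample):
    print(f"devid {devid} ({path}) aborted after {duration}")
```

In the output above, both aborted devices stopped after the same 05:01:37, which would be consistent with a single event ending both scrubs, although the log itself does not say what that event was.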
BTRFS critical: corrupt leaf, slot offset bad; then read-only
Hi list!

I have a btrfs pool consisting of 5x 2.72 TiB LUKS (dm-crypt) partitions in RAID1, mounted on Linux 4.4 with btrfs-progs 4.4.

I never had any crashes or power loss here, but recently, about every 60 - 120 minutes (while in use), btrfs detects corruption, aborts the transaction and drops to read-only mode.

btrfs still mounts normally without any special options (it does take about 60 seconds, which I guess is normal for this kind of size). All LUKS partitions have at least 400GiB of free space.

I don't see any HW problems here; I doubt the corruption is coming from the LUKS partitions. I did test the RAM, and it seems fine in multiple memtest86+ and memtest86 runs.

Are there any known bugs in 4.4? Any suggestions would be greatly appreciated! I have to admit I did not scrub regularly.

Thanks,
Lukas

---

~# uname -a
Linux srv1-dom0 4.4.0-63-generic #84-Ubuntu SMP Wed Feb 1 17:20:32 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
~# btrfs --version
btrfs-progs v4.4
~# btrfs fi show
Label: 'dom0-os'  uuid: e475636c-21e0-4563-87d6-91f03c519a62
    Total devices 5 FS bytes used 3.52GiB
    devid 1 size 10.00GiB used 3.53GiB path /dev/sda2
    devid 2 size 10.00GiB used 4.25GiB path /dev/sdb2
    devid 3 size 10.00GiB used 3.28GiB path /dev/sdc2
    devid 4 size 10.00GiB used 4.00GiB path /dev/sdd2
    devid 5 size 10.00GiB used 4.00GiB path /dev/sde2

Label: 'storage_pool'  uuid: f50f980e-7640-49c7-bf8d-20d55cfe6005
    Total devices 5 FS bytes used 5.77TiB
    devid 1 size 2.72TiB used 2.31TiB path /dev/mapper/sda3_crypt
    devid 2 size 2.72TiB used 2.31TiB path /dev/mapper/sdb3_crypt
    devid 3 size 2.72TiB used 2.31TiB path /dev/mapper/sdc3_crypt
    devid 4 size 2.72TiB used 2.31TiB path /dev/mapper/sdd3_crypt
    devid 5 size 2.72TiB used 2.31TiB path /dev/mapper/sde3_crypt

~# btrfs fi df /storage/users/
Data, RAID1: total=5.77TiB, used=5.76TiB
System, RAID1: total=32.00MiB, used=832.00KiB
Metadata, RAID1: total=8.00GiB, used=6.96GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
~#
~# partial dmesg:
[ 1509.033492]
BTRFS: device label storage_pool devid 1 transid 238135 /dev/dm-5 [ 1510.498804] BTRFS: device label storage_pool devid 2 transid 238135 /dev/dm-6 [ 1511.980968] BTRFS: device label storage_pool devid 3 transid 238135 /dev/dm-7 [ 1513.461799] BTRFS: device label storage_pool devid 4 transid 238135 /dev/dm-8 [ 1514.838757] BTRFS: device label storage_pool devid 5 transid 238135 /dev/dm-9 [ 1517.726471] BTRFS info (device dm-9): btrfs: use no compression [ 1517.726477] BTRFS info (device dm-9): disk space caching is enabled [ 1517.726479] BTRFS: has skinny extents [ 1569.598633] BTRFS: checking UUID tree [ 3540.825747] BTRFS critical (device dm-9): corrupt leaf, slot offset bad: block=5242107641856,root=1, slot=39 [ 3540.836168] BTRFS critical (device dm-9): corrupt leaf, slot offset bad: block=5242107641856,root=1, slot=39 [ 3540.846413] [ cut here ] [ 3540.846432] WARNING: CPU: 2 PID: 2757 at /build/linux-mPTI9s/linux-4.4.0/fs/btrfs/extent-tree.c:2930 btrfs_run_delayed_refs+0x26b/0x2a0 [btrfs]() [ 3540.846433] BTRFS: Transaction aborted (error -5) [ 3540.846434] Modules linked in: algif_skcipher af_alg xen_gntdev xen_evtchn xenfs xen_privcmd drbg ansi_cprng dm_crypt nls_iso8859_1 bridge stp llc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel serio_raw joydev input_leds nuvoton_cir 8250_fintek ie31200_edac mac_hid rc_core lpc_ich edac_core shpchp mei_me mei ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid mxm_wmi i915 i2c_algo_bit drm_kms_helper aesni_intel aes_x86_64 glue_helper syscopyarea sysfillrect firewire_ohci sysimgblt firewire_core fb_sys_fops lrw psmouse [ 3540.846466] tg3 gf128mul ablk_helper cryptd crc_itu_t ptp ahci drm pps_core libahci fjes wmi video [ 3540.846473] CPU: 2 
PID: 2757 Comm: btrfs-transacti Not tainted 4.4.0-63-generic #84-Ubuntu [ 3540.846475] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Extreme6, BIOS P2.80 07/01/2013 [ 3540.846476] 0200 02709bc3 88007615fc90 813f8083 [ 3540.846478] 88007615fcd8 c048d498 88007615fcc8 810812d2 [ 3540.846479] 8802adf562f8 8802a9c71800 8800056caef0 [ 3540.846481] Call Trace: [ 3540.846486] [] dump_stack+0x63/0x90 [ 3540.846489] [] warn_slowpath_common+0x82/0xc0 [ 3540.846491] [] warn_slowpath_fmt+0x5c/0x80 [ 3540.846500] [] ? __btrfs_run_delayed_refs+0xcdd/0x1220 [btrfs] [ 3540.846509] [] btrfs_run_delayed_refs+0x26b/0x2a0 [btrfs] [ 3540.846520] [] commit_cowonly_roots+0x22b/0x2c2
Re: corrupt leaf, slot offset bad
Am Tue, 11 Oct 2016 07:09:49 -0700 schrieb Liu Bo: > On Tue, Oct 11, 2016 at 02:48:09PM +0200, David Sterba wrote: > > Hi, > > > > looks like a lot of random bitflips. > > > > On Mon, Oct 10, 2016 at 11:50:14PM +0200, a...@aron.ws wrote: > > > item 109 has a few strange chars in its name (and it's > > > truncated): 1-x86_64.pkg.tar.xz 0x62 0x14 0x0a 0x0a > > > > > > item 105 key (261 DIR_ITEM 54556048) itemoff 11723 > > > itemsize 72 location key (606286 INODE_ITEM 0) type FILE > > > namelen 42 datalen 0 name: > > > python2-gobject-3.20.1-1-x86_64.pkg.tar.xz item 106 key (261 > > > DIR_ITEM 56363628) itemoff 11660 itemsize 63 location key (894298 > > > INODE_ITEM 0) type FILE namelen 33 datalen 0 name: > > > unrar-1:5.4.5-1-x86_64.pkg.tar.xz item 107 key (261 DIR_ITEM > > > 66963651) itemoff 11600 itemsize 60 location key (1178 INODE_ITEM > > > 0) type FILE namelen 30 datalen 0 name: > > > glibc-2.23-5-x86_64.pkg.tar.xz item 108 key (261 DIR_ITEM > > > 68561395) itemoff 11532 itemsize 68 location key (660578 > > > INODE_ITEM 0) type FILE namelen 38 datalen 0 name: > > > squashfs-tools-4.3-4-x86_64.pkg.tar.xz item 109 key (261 DIR_ITEM > > > 76859450) itemoff 11483 itemsize 65 location key (2397184 > > > UNKNOWN.0 7091317839824617472) type 45 namelen 13102 datalen > > > 13358 name: 1-x86_64.pkg.tar.xzb > > > > namelen must be smaller than 255, but the number itself does not > > look like a bitflip (0x332e), the name looks like a fragment of. > > > > The location key is random garbage, likely an overwritten memory, > > 7091317839824617472 == 0x62696c010023 contains ascii 'bil', the > > key type is unknown but should be INODE_ITEM. 
> >
> > > data
> > >     item 110 key (261 DIR_ITEM 9799832789237604651) itemoff 11405 itemsize 62
> > >         location key (388547 INODE_ITEM 0) type FILE
> > >         namelen 32 datalen 0 name: intltool-0.51.0-1-any.pkg.tar.xz
> > >     item 111 key (261 DIR_ITEM 81211850) itemoff 11344 itemsize 131133
> >
> > itemsize 131133 == 0x2003d is a clear bitflip, 0x3d == 61,
> > corresponds to the expected item size.
> >
> > There's possibly other random bitflips in the keys or other
> > structures. It's hard to estimate the damage and thus the scope of
> > restorable data.
>
> It makes sense: since this is an SSD, we may have only one copy of the
> metadata.
>
> Thanks,
>
> -liubo

From this point of view, it doesn't make sense to store only one copy of metadata on an SSD... Taking the other garbage into account, the bit flip probably happened in RAM, so dup metadata could have helped here.

If the SSD firmware collapsed duplicate metadata into single blobs, that would be perfectly fine. If the dup metadata arrives with bits flipped, it won't be deduplicated. So this is fine, too.

BTW: I cannot believe that SSD firmware really does the quite expensive job of deduplication, other than maybe internal compression. Maybe there are some drives out there that do, but most won't deduplicate. It's just too little gain for too much complexity.

So I personally would always switch on duplicate metadata, even on SSD. It shouldn't add too much to wear if you do the usual SSD optimization anyway (like noatime).

PS: I suggest doing an extensive memtest86 run before trying any repairs on this system... Are you perhaps mixing different DIMM models in dual-channel slots? Most of the times I've seen bitflips, this was the culprit...

--
Regards,
Kai

Replies to list-only preferred.
Re: corrupt leaf, slot offset bad
On Tue, Oct 11, 2016 at 02:48:09PM +0200, David Sterba wrote: > Hi, > > looks like a lot of random bitflips. > > On Mon, Oct 10, 2016 at 11:50:14PM +0200, a...@aron.ws wrote: > > item 109 has a few strange chars in its name (and it's truncated): > > 1-x86_64.pkg.tar.xz 0x62 0x14 0x0a 0x0a > > > > item 105 key (261 DIR_ITEM 54556048) itemoff 11723 itemsize 72 > > location key (606286 INODE_ITEM 0) type FILE > > namelen 42 datalen 0 name: > > python2-gobject-3.20.1-1-x86_64.pkg.tar.xz > > item 106 key (261 DIR_ITEM 56363628) itemoff 11660 itemsize 63 > > location key (894298 INODE_ITEM 0) type FILE > > namelen 33 datalen 0 name: unrar-1:5.4.5-1-x86_64.pkg.tar.xz > > item 107 key (261 DIR_ITEM 66963651) itemoff 11600 itemsize 60 > > location key (1178 INODE_ITEM 0) type FILE > > namelen 30 datalen 0 name: glibc-2.23-5-x86_64.pkg.tar.xz > > item 108 key (261 DIR_ITEM 68561395) itemoff 11532 itemsize 68 > > location key (660578 INODE_ITEM 0) type FILE > > namelen 38 datalen 0 name: > > squashfs-tools-4.3-4-x86_64.pkg.tar.xz > > item 109 key (261 DIR_ITEM 76859450) itemoff 11483 itemsize 65 > > location key (2397184 UNKNOWN.0 7091317839824617472) type 45 > > namelen 13102 datalen 13358 name: 1-x86_64.pkg.tar.xzb > > namelen must be smaller than 255, but the number itself does not look > like a bitflip (0x332e), the name looks like a fragment of. > > The location key is random garbage, likely an overwritten memory, > 7091317839824617472 == 0x62696c010023 contains ascii 'bil', the key > type is unknown but should be INODE_ITEM. > > > data > > item 110 key (261 DIR_ITEM 9799832789237604651) itemoff 11405 itemsize > > 62 > > location key (388547 INODE_ITEM 0) type FILE > > namelen 32 datalen 0 name: intltool-0.51.0-1-any.pkg.tar.xz > > item 111 key (261 DIR_ITEM 81211850) itemoff 11344 itemsize 131133 > > itemsize 131133 == 0x2003d is a clear bitflip, 0x3d == 61, corresponds > to the expected item size. 
>
> There's possibly other random bitflips in the keys or other structures.
> It's hard to estimate the damage and thus the scope of restorable data.

It makes sense: since this is an SSD, we may have only one copy of the metadata.

Thanks,

-liubo
Re: corrupt leaf, slot offset bad
Hi, looks like a lot of random bitflips. On Mon, Oct 10, 2016 at 11:50:14PM +0200, a...@aron.ws wrote: > item 109 has a few strange chars in its name (and it's truncated): > 1-x86_64.pkg.tar.xz 0x62 0x14 0x0a 0x0a > > item 105 key (261 DIR_ITEM 54556048) itemoff 11723 itemsize 72 > location key (606286 INODE_ITEM 0) type FILE > namelen 42 datalen 0 name: > python2-gobject-3.20.1-1-x86_64.pkg.tar.xz > item 106 key (261 DIR_ITEM 56363628) itemoff 11660 itemsize 63 > location key (894298 INODE_ITEM 0) type FILE > namelen 33 datalen 0 name: unrar-1:5.4.5-1-x86_64.pkg.tar.xz > item 107 key (261 DIR_ITEM 66963651) itemoff 11600 itemsize 60 > location key (1178 INODE_ITEM 0) type FILE > namelen 30 datalen 0 name: glibc-2.23-5-x86_64.pkg.tar.xz > item 108 key (261 DIR_ITEM 68561395) itemoff 11532 itemsize 68 > location key (660578 INODE_ITEM 0) type FILE > namelen 38 datalen 0 name: > squashfs-tools-4.3-4-x86_64.pkg.tar.xz > item 109 key (261 DIR_ITEM 76859450) itemoff 11483 itemsize 65 > location key (2397184 UNKNOWN.0 7091317839824617472) type 45 > namelen 13102 datalen 13358 name: 1-x86_64.pkg.tar.xzb namelen must be smaller than 255, but the number itself does not look like a bitflip (0x332e), the name looks like a fragment of. The location key is random garbage, likely an overwritten memory, 7091317839824617472 == 0x62696c010023 contains ascii 'bil', the key type is unknown but should be INODE_ITEM. > data > item 110 key (261 DIR_ITEM 9799832789237604651) itemoff 11405 itemsize > 62 > location key (388547 INODE_ITEM 0) type FILE > namelen 32 datalen 0 name: intltool-0.51.0-1-any.pkg.tar.xz > item 111 key (261 DIR_ITEM 81211850) itemoff 11344 itemsize 131133 itemsize 131133 == 0x2003d is a clear bitflip, 0x3d == 61, corresponds to the expected item size. There's possibly other random bitflips in the keys or other structures. It's hard to estimate the damage and thus the scope of restorable data. 
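The single-bit reasoning above can be checked mechanically: two values differ by exactly one flipped bit when their XOR is a power of two. This is only an illustrative sketch; the helper name `single_bit_flip` is made up for this example, and the numbers are the ones quoted from the dump.

```python
def single_bit_flip(observed, expected):
    """Return the index of the flipped bit if the two values differ in
    exactly one bit, otherwise None."""
    diff = observed ^ expected
    # A non-zero value with exactly one bit set satisfies diff & (diff - 1) == 0.
    if diff != 0 and diff & (diff - 1) == 0:
        return diff.bit_length() - 1
    return None


# itemsize 131133 versus the expected item size 61 (0x3d)
print(hex(131133))                   # 0x2003d
print(single_bit_flip(131133, 61))   # 17

# namelen 13102 (0x332e): no valid name length (1..255) is one bit away
print(any(single_bit_flip(13102, n) is not None for n in range(1, 256)))  # False
```

The second check matches the observation that the namelen value does not look like a bitflip: its upper byte 0x33 alone has four bits set, so more than one bit would have had to change.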
Re: corrupt leaf, slot offset bad
Hi liubo, item 109 has a few strange chars in its name (and it's truncated): 1-x86_64.pkg.tar.xz 0x62 0x14 0x0a 0x0a item 105 key (261 DIR_ITEM 54556048) itemoff 11723 itemsize 72 location key (606286 INODE_ITEM 0) type FILE namelen 42 datalen 0 name: python2-gobject-3.20.1-1-x86_64.pkg.tar.xz item 106 key (261 DIR_ITEM 56363628) itemoff 11660 itemsize 63 location key (894298 INODE_ITEM 0) type FILE namelen 33 datalen 0 name: unrar-1:5.4.5-1-x86_64.pkg.tar.xz item 107 key (261 DIR_ITEM 66963651) itemoff 11600 itemsize 60 location key (1178 INODE_ITEM 0) type FILE namelen 30 datalen 0 name: glibc-2.23-5-x86_64.pkg.tar.xz item 108 key (261 DIR_ITEM 68561395) itemoff 11532 itemsize 68 location key (660578 INODE_ITEM 0) type FILE namelen 38 datalen 0 name: squashfs-tools-4.3-4-x86_64.pkg.tar.xz item 109 key (261 DIR_ITEM 76859450) itemoff 11483 itemsize 65 location key (2397184 UNKNOWN.0 7091317839824617472) type 45 namelen 13102 datalen 13358 name: 1-x86_64.pkg.tar.xzb data item 110 key (261 DIR_ITEM 9799832789237604651) itemoff 11405 itemsize 62 location key (388547 INODE_ITEM 0) type FILE namelen 32 datalen 0 name: intltool-0.51.0-1-any.pkg.tar.xz item 111 key (261 DIR_ITEM 81211850) itemoff 11344 itemsize 131133 location key (893669 INODE_ITEM 0) type FILE namelen 31 datalen 0 name: babl-0.1.16-1-x86_64.pkg.tar.xz location key (388547 INODE_ITEM 0) type FILE Thanks, Aron On 2016-10-10 23:03, Liu Bo wrote: On Mon, Oct 10, 2016 at 08:57:19PM +0200, aron@aron.wswrote: Hi all, I've been using btrfs for a few months now, without any problems. During work, I've noticed segfaults, when accessing my root directory. As my home directory contents was readable, I've decided to reboot. That was the worst decision, as now I can't copy my data off the SSD. It seems like a memory isse. I have backups, but its ~2 weeks old. What I did is a dd dump immediately. Have latest kernel and latest progs built from source now, but :S ... 
This is what I've got:

When mounting:

BTRFS critical (device: sdb2): corrupt leaf, slot offset bad: block=610107392,root=1, slot=108

This indicates that leaf 610107392 is corrupted somehow, because its slot 108's 'start offset in leaf' and slot 109's 'end offset in leaf' don't match each other; the cause is not shown, though.

find-root prints nothing to stdout after 2 hours.

running btrfs inspect-internal dump-tree -b 610107392 /dev/sdb2

leaf 610107392 items 188 free space 1690 generation 90792 owner 5

owner 5 means that it's not a tree root leaf, it's a fs tree leaf.

fs uuid 2cc75a87-b22b-448e-80d4-383a9f42deed
chunk uuid a5b09a2a-da3d-4049-91ba-4fe66932907b
    item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
        inode generation 3 transid 90769 size 144 nbytes 16384
        block group 0 mode 40755 links 1 uid 0 gid 0
        rdev 0 flags 0x0(none)
    item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12
        inode ref index 0 namelen 2 name: ..
    item 2 key (256 DIR_ITEM 145260132) itemoff 16078 itemsize 33
        location key (265 INODE_ITEM 0) type DIR
        namelen 3 datalen 0 name: dev
    item 3 key (256 DIR_ITEM 217684952) itemoff 16045 itemsize 33
        location key (266 INODE_ITEM 0) type DIR
        namelen 3 datalen 0 name: run
    item 4 key (256 DIR_ITEM 308198373) itemoff 16011 itemsize 34
        location key (257 INODE_ITEM 0) type DIR
...

Maybe we can check the content of item 108 and item 109 in this output from 'dump-tree'?

Thanks,

-liubo

    item 111 key (261 DIR_ITEM 81211850) itemoff 11344 itemsize 131133
        location key (893669 INODE_ITEM 0) type FILE
        namelen 31 datalen 0 name: babl-0.1.16-1-x86_64.pkg.tar.xz
        location key (388547 INODE_ITEM 0) type FILE
        namelen 32 datalen 0 name: intltool-0.51.0-1-any.pkg.tar.xz
...
namelen 30 datalen 0 name: glibc-2.24-2-x86_64.pkg.tar.xz location key (893658 INODE_ITEM 0) type FILE namelen 36 datalen 0 name: procps-ng-3.3.12-1-x86_64.pkg.tar.xz location key (EXTENT_TREE UNKNOWN.3 36094832640) type 12 namelen 0 datalen 0 name: location key (291 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (18556457741975552 UNKNOWN.0 0) type 0 namelen 0 datalen 7134 name: data location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: location key (0 UNKNOWN.0 0) type 0 namelen 0 datalen 0 name: segfault running restore: incorrect offsets 11532
Re: corrupt leaf, slot offset bad
On Mon, Oct 10, 2016 at 08:57:19PM +0200, a...@aron.ws wrote:
> Hi all,
>
> I've been using btrfs for a few months now, without any problems. During
> work, I've noticed segfaults, when accessing my root directory. As my home
> directory contents was readable, I've decided to reboot. That was the worst
> decision, as now I can't copy my data off the SSD. It seems like a memory
> issue. I have backups, but its ~2 weeks old. What I did is a dd dump
> immediately. Have latest kernel and latest progs built from source now, but
> :S ...
>
> This is what I've got:
>
> When mounting:
>
> BTRFS critical (device: sdb2): corrupt leaf, slot offset bad:
> block=610107392,root=1, slot=108

This indicates that leaf 610107392 is corrupted: slot 108's 'start offset in leaf' and slot 109's 'end offset in leaf' do not match each other. The message does not show the cause, though.

>
> find-root prints nothing to the stdout after 2 hours.
>
> running btrfs inspect-internal dump-tree -b 610107392 /dev/sdb2
>
> leaf 610107392 items 188 free space 1690 generation 90792 owner 5

owner 5 means that it's not a tree root leaf, it's an fs tree leaf.

> fs uuid 2cc75a87-b22b-448e-80d4-383a9f42deed
> chunk uuid a5b09a2a-da3d-4049-91ba-4fe66932907b
>     item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
>         inode generation 3 transid 90769 size 144 nbytes 16384
>         block group 0 mode 40755 links 1 uid 0 gid 0
>         rdev 0 flags 0x0(none)
>     item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12
>         inode ref index 0 namelen 2 name: ..
>     item 2 key (256 DIR_ITEM 145260132) itemoff 16078 itemsize 33
>         location key (265 INODE_ITEM 0) type DIR
>         namelen 3 datalen 0 name: dev
>     item 3 key (256 DIR_ITEM 217684952) itemoff 16045 itemsize 33
>         location key (266 INODE_ITEM 0) type DIR
>         namelen 3 datalen 0 name: run
>     item 4 key (256 DIR_ITEM 308198373) itemoff 16011 itemsize 34
>         location key (257 INODE_ITEM 0) type DIR
>
> ...

Maybe we can check the content of item 108 and item 109 in this output from 'dump-tree'?

Thanks,

-liubo

>     item 111 key (261 DIR_ITEM 81211850) itemoff 11344 itemsize 131133
>         location key (893669 INODE_ITEM 0) type FILE
>         namelen 31 datalen 0 name: babl-0.1.16-1-x86_64.pkg.tar.xz
>         location key (388547 INODE_ITEM 0) type FILE
>         namelen 32 datalen 0 name: intltool-0.51.0-1-any.pkg.tar.xz
> ...
>         namelen 30 datalen 0 name: glibc-2.24-2-x86_64.pkg.tar.xz
>         location key (893658 INODE_ITEM 0) type FILE
>         namelen 36 datalen 0 name: procps-ng-3.3.12-1-x86_64.pkg.tar.xz
>         location key (EXTENT_TREE UNKNOWN.3 36094832640) type 12
>         namelen 0 datalen 0 name:
>         location key (291 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (18556457741975552 UNKNOWN.0 0) type 0
>         namelen 0 datalen 7134 name:
> data
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>         location key (0 UNKNOWN.0 0) type 0
>         namelen 0 datalen 0 name:
>
> segfault
>
> running restore:
>
> incorrect offsets 11532 11548
> Error searching -1
>
> Tried every rescue, check commands, in different variations ... nothing. It
> seems that the root leaf (?) has some garbage, tried using the corrupt-block
> utility, to mark the item dirty got the same error: incorrect offsets.
>
> The only thing I've managed is to restore a part of the /etc directory,
> with: btrfs restore -i -f 610123776 - -d /dev/sdb2 /mnt/restore
>
> I'm still trying to learn how the data is structured now, but my problem is
> that I can't figure out how to calculate the leaf positions, using the
> dump-tree output ...
>
> I need some kind tool/script that can recursively rescue the structure from
> a defined leaf. (can this be done?)
corrupt leaf, slot offset bad
Hi all,

I've been using btrfs for a few months now, without any problems. During work, I noticed segfaults when accessing my root directory. As my home directory contents were readable, I decided to reboot. That was the worst decision, as now I can't copy my data off the SSD. It seems like a memory issue. I have backups, but they're ~2 weeks old. What I did is a dd dump immediately. Have latest kernel and latest progs built from source now, but :S ...

This is what I've got:

When mounting:

BTRFS critical (device: sdb2): corrupt leaf, slot offset bad: block=610107392,root=1, slot=108

find-root prints nothing to the stdout after 2 hours.

running btrfs inspect-internal dump-tree -b 610107392 /dev/sdb2

leaf 610107392 items 188 free space 1690 generation 90792 owner 5
fs uuid 2cc75a87-b22b-448e-80d4-383a9f42deed
chunk uuid a5b09a2a-da3d-4049-91ba-4fe66932907b
    item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
        inode generation 3 transid 90769 size 144 nbytes 16384
        block group 0 mode 40755 links 1 uid 0 gid 0
        rdev 0 flags 0x0(none)
    item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12
        inode ref index 0 namelen 2 name: ..
    item 2 key (256 DIR_ITEM 145260132) itemoff 16078 itemsize 33
        location key (265 INODE_ITEM 0) type DIR
        namelen 3 datalen 0 name: dev
    item 3 key (256 DIR_ITEM 217684952) itemoff 16045 itemsize 33
        location key (266 INODE_ITEM 0) type DIR
        namelen 3 datalen 0 name: run
    item 4 key (256 DIR_ITEM 308198373) itemoff 16011 itemsize 34
        location key (257 INODE_ITEM 0) type DIR
...
    item 111 key (261 DIR_ITEM 81211850) itemoff 11344 itemsize 131133
        location key (893669 INODE_ITEM 0) type FILE
        namelen 31 datalen 0 name: babl-0.1.16-1-x86_64.pkg.tar.xz
        location key (388547 INODE_ITEM 0) type FILE
        namelen 32 datalen 0 name: intltool-0.51.0-1-any.pkg.tar.xz
...
        namelen 30 datalen 0 name: glibc-2.24-2-x86_64.pkg.tar.xz
        location key (893658 INODE_ITEM 0) type FILE
        namelen 36 datalen 0 name: procps-ng-3.3.12-1-x86_64.pkg.tar.xz
        location key (EXTENT_TREE UNKNOWN.3 36094832640) type 12
        namelen 0 datalen 0 name:
        location key (291 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (18556457741975552 UNKNOWN.0 0) type 0
        namelen 0 datalen 7134 name:
data
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:
        location key (0 UNKNOWN.0 0) type 0
        namelen 0 datalen 0 name:

segfault

running restore:

incorrect offsets 11532 11548
Error searching -1

I tried every rescue and check command, in different variations ... nothing. It seems that the root leaf (?) has some garbage. I tried using the corrupt-block utility to mark the item dirty and got the same error: incorrect offsets.

The only thing I've managed is to restore a part of the /etc directory, with:

btrfs restore -i -f 610123776 - -d /dev/sdb2 /mnt/restore

I'm still trying to learn how the data is structured, but my problem is that I can't figure out how to calculate the leaf positions using the dump-tree output ...

I need some kind of tool/script that can recursively rescue the structure from a defined leaf. (Can this be done?)

Any help would be appreciated! :)

Thanks!

Yours,
Aron
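The `owner` field in the leaf header of the dump above says which tree the leaf belongs to; the reply in this thread reads owner 5 as "an fs tree leaf". A minimal sketch of that lookup, assuming the well-known tree object IDs from the btrfs on-disk format (the function name `describe_owner` and the wording of the labels are made up for illustration):

```python
# Illustrative lookup only, not btrfs-progs code.
# The numeric IDs are the fixed tree object IDs of the btrfs on-disk format.
TREE_NAMES = {
    1: "root tree",
    2: "extent tree",
    3: "chunk tree",
    4: "device tree",
    5: "fs tree (top-level subvolume)",
    6: "root tree directory",
    7: "checksum tree",
}


def describe_owner(owner):
    """Return a human-readable label for a leaf header's owner field."""
    if owner in TREE_NAMES:
        return TREE_NAMES[owner]
    if owner >= 256:
        # Object IDs from 256 upwards are handed out to subvolumes/snapshots.
        return f"subvolume tree {owner}"
    return f"unknown or reserved tree {owner}"


print(describe_owner(5))  # owner of leaf 610107392 in the dump above
```

So a damaged leaf with owner 5 holds directory and inode items of the top-level subvolume, which fits the DIR_ITEM and INODE_ITEM keys shown in the dump.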
Re: Crash, boot mount failure: "corrupt leaf, slot offset bad"
I had another btrfs crash with identical symptoms. This time I managed to reproduce the crash and identify the root cause. It turns out that the Apple boot firmware is buggy and leaves wifi DMA turned on, which randomly corrupts memory after Linux is running. https://bugzilla.kernel.org/show_bug.cgi?id=111781 So btrfs is not to blame after all.
Re: Crash, boot mount failure: "corrupt leaf, slot offset bad"
On 5 January 2016 at 01:57, Qu Wenruo wrote:
>>
>> Data, single: total=106.79GiB, used=82.01GiB
>> System, single: total=4.00MiB, used=16.00KiB
>> Metadata, single: total=2.01GiB, used=1.51GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> That's the btrfs fi df misleading output confusing you.
>
> In fact, your metadata is already used up without available space.
> GlobalReserve should also be counted as Metadata *used* space.

Thanks for the explanation - the FAQ[1] misleads when it describes GlobalReserve as "The block reserve is only virtual and is not stored on the devices." - which sounds like the reserve is literally not stored on the drive.

The FAQ[2] also suggests that the free space in metadata can be less than the block reserve total:

"If the free space in metadata is less than or equal to the block reserve value (typically 512 MiB, but might be something else on a particularly small or large filesystem), then it's close to full."

But what you are saying is that this is wrong and the free space in metadata can never be less than the block reserve, because the block reserve includes the metadata free space?

[1] https://btrfs.wiki.kernel.org/index.php/FAQ#What_is_the_GlobalReserve_and_why_does_.27btrfs_fi_df.27_show_it_as_single_even_on_RAID_filesystems.3F
[2] https://btrfs.wiki.kernel.org/index.php/FAQ#if_your_device_is_large_.28.3E16GiB.29

> Good, 5GiB freed space, it can be allocated for metadata to slightly reduce
> the metadata pressure.
>
> But not for long.
> The root resolve will be, add more space into this btrfs.

Yes, but this is a 128GB SSD, and metadata could have been reallocated from some of the 25GB of free space allocated to data. Even with a bigger drive, it is possible that chunks could be allocated to data, and then later operations requiring more metadata will still run out (running out of metadata space seems to be a reasonably common occurrence, judging by the number of "why is btrfs reporting no space when I have space free" questions). The file system shouldn't be corrupted when that happens.
Re: Crash, boot mount failure: "corrupt leaf, slot offset bad"
Chris Bainbridge wrote on 2016/01/05 13:41:
> On 5 January 2016 at 01:57, Qu Wenruo wrote:
>>>
>>> Data, single: total=106.79GiB, used=82.01GiB
>>> System, single: total=4.00MiB, used=16.00KiB
>>> Metadata, single: total=2.01GiB, used=1.51GiB
>>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> That's the btrfs fi df misleading output confusing you.
>>
>> In fact, your metadata is already used up without available space.
>> GlobalReserve should also be counted as Metadata *used* space.
>
> Thanks for the explanation - the FAQ[1] misleads when it describes
> GlobalReserve as "The block reserve is only virtual and is not stored
> on the devices." - which sounds like the reserve is literally not
> stored on the drive.

In fact, the FAQ description is not wrong either. GlobalReserve is not stored anywhere, that's true. Since it doesn't take up space (unless its used value is not 0), it is stored nowhere, and the FAQ is right.

The metadata allocation algorithm will try its best to keep enough free space for GlobalReserve. So for the end user, space you can't directly use is no different from used space.

> The FAQ[2] also suggests that the free space in metadata can be less
> than the block reserve total:
>
> "If the free space in metadata is less than or equal to the block
> reserve value (typically 512 MiB, but might be something else on a
> particularly small or large filesystem), then it's close to full."
>
> But what you are saying is that this is wrong and the free space in
> metadata can never be less than the block reserve, because the block
> reserve includes the metadata free space?

Sorry for the confusion. Yes, it's possible for the available metadata space to be less than the global reserve. But when that happens, the used value of GlobalReserve is not 0, and unfortunately you are already extremely short of space, meaning you can't even touch an empty file. And in that case, if your kernel is not new enough, you can't even delete a file, thanks to metadata COW.

So in the common case, one can just treat the global reserve as used metadata, unless the used global reserve is not 0.

> [1] https://btrfs.wiki.kernel.org/index.php/FAQ#What_is_the_GlobalReserve_and_why_does_.27btrfs_fi_df.27_show_it_as_single_even_on_RAID_filesystems.3F
> [2] https://btrfs.wiki.kernel.org/index.php/FAQ#if_your_device_is_large_.28.3E16GiB.29
>
>> Good, 5GiB freed space, it can be allocated for metadata to slightly reduce
>> the metadata pressure.
>>
>> But not for long.
>> The root resolve will be, add more space into this btrfs.
>
> Yes but this is a 128GB SSD and metadata could have been reallocated
> from some of the 25GB of free space allocated to data.

This can only happen when:

1) All data chunks are balanced into a very compact layout, to free all the 25G.
   Since btrfs stores data and metadata in different chunks, one needs to use balance to free space from allocated data/metadata chunks. And in your case, you only tried dlimit=1, 2 and 5, which will free at most 8 chunks (and at most 8G of space). If you want to free all the 25G of free space from data chunks, then use no dlimit at all.

2) Mixed block groups.
   This is the most straightforward case. All data and metadata can be stored in the same chunk, so there is no such problem at all. But developers tend to avoid that setup.

> Even with a bigger drive, it is possible that chunks could be allocated
> to data, and then later operations requiring more metadata will still
> run out (running out of metadata space seems to be a reasonably common
> occurrence judging by the number of "why is btrfs reporting no space
> when I have space free" questions).

This is true, and that's a long-standing btrfs problem. Apart from balancing and adding more devices, there are no really good ideas so far. Maybe one day we can improve the allocation algorithm.

> The file system shouldn't be corrupted when that happens.

I'm sorry for going off topic with GlobalReserve and unbalanced data/metadata chunks. But I don't think the corruption is caused by unbalanced data/metadata chunks.

So let's go back to the corruption case. Since you took an image of the corrupted fs, would you please try the following command on the corrupted fs?

$ btrfs-debug-tree -b 67239936

And what were the kernel mount options for the fs before the crash?

The kernel messages show that your tree root is corrupted. This is common after a power loss. But the problem is, btrfs uses barriers to ensure the superblock is written to disk *after* all other metadata is committed. Otherwise the superblock is not updated and still points to the old metadata, which keeps everything fine.

So, either barriers are broken, or you specified nobarrier, or the power loss directly corrupted the new tree root and magically left the csum still matching.

Thanks,
Qu
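The GlobalReserve rule of thumb from the message above can be checked with plain arithmetic on the `btrfs fi df` figures quoted in this thread. This is only a sketch of that rule (count the reserve as used metadata while its own used value is 0); the function name is made up for illustration, and the 1 GiB data chunk size is an assumption behind the "at most 8G" estimate, not something the tool reports:

```python
from decimal import Decimal


def effective_metadata_free(meta_total, meta_used, reserve_total):
    """Metadata space left (GiB) once GlobalReserve is counted as used.

    Only meaningful while GlobalReserve's own 'used' value is 0, which is
    the common case described in the message above.
    """
    return meta_total - meta_used - reserve_total


# Figures from the quoted 'btrfs fi df' output (512.00MiB == 0.50GiB).
free_gib = effective_metadata_free(
    Decimal("2.01"), Decimal("1.51"), Decimal("0.50")
)
print(f"effective metadata free: {free_gib} GiB")

# The balance runs used dlimit=1, 2 and 5; each run touches at most that
# many data chunks. Assuming 1 GiB per data chunk, that bounds the space
# those runs could hand back to the unallocated pool.
dlimits = [1, 2, 5]
max_chunks = sum(dlimits)
print(f"chunks relocated at most: {max_chunks} (about {max_chunks} GiB)")
```

With these inputs the metadata that looks half a GiB free in `fi df` has nothing left once the reserve is subtracted, which is the "already used up" situation described above. Note that `fi df` rounds its figures, so the real value is only approximately zero.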
Re: Crash, boot mount failure: "corrupt leaf, slot offset bad"
On Wed, Jan 06, 2016 at 08:57:28AM +0800, Qu Wenruo wrote: > > Since you took the image of the corrupted fs, would you please try the > following commands on the corrupted fs? > > $ btrfs-debug-tree -b 67239936 Command runs then segfaults: leaf 67239936 items 92 free space 9138 generation 276688 owner 2 fs uuid b1103526-98a3-4b40-a782-cf66721ed600 chunk uuid 16e767e3-a321-4d0f-9c72-6ebac9d305c4 item 0 key (61513990144 EXTENT_ITEM 16384) itemoff 16232 itemsize 51 extent refs 1 gen 276685 flags TREE_BLOCK tree block key (61617864704 EXTENT_ITEM 16384) level 0 tree block backref root 2 item 1 key (61514006528 EXTENT_ITEM 16384) itemoff 16181 itemsize 51 extent refs 1 gen 276285 flags TREE_BLOCK tree block key (27627 DIR_INDEX 3576) level 0 tree block backref root 260 item 2 key (61514022912 EXTENT_ITEM 16384) itemoff 16130 itemsize 51 extent refs 1 gen 266026 flags TREE_BLOCK tree block key (EXTENT_CSUM EXTENT_CSUM 99424354304) level 0 tree block backref root 7 item 3 key (61514039296 EXTENT_ITEM 16384) itemoff 16079 itemsize 51 extent refs 1 gen 275904 flags TREE_BLOCK tree block key (118 INODE_ITEM 0) level 0 tree block backref root 260 item 4 key (61514055680 EXTENT_ITEM 16384) itemoff 16028 itemsize 51 extent refs 1 gen 276685 flags TREE_BLOCK tree block key (61702111232 EXTENT_ITEM 16384) level 0 tree block backref root 2 item 5 key (61514088448 EXTENT_ITEM 16384) itemoff 15977 itemsize 51 extent refs 1 gen 276285 flags TREE_BLOCK tree block key (1906872 INODE_REF 34185) level 0 tree block backref root 260 item 6 key (61514104832 EXTENT_ITEM 16384) itemoff 15926 itemsize 51 extent refs 1 gen 276685 flags TREE_BLOCK tree block key (61741957120 EXTENT_ITEM 16384) level 0 tree block backref root 2 item 7 key (61514121216 EXTENT_ITEM 16384) itemoff 15875 itemsize 51 extent refs 1 gen 266026 flags TREE_BLOCK tree block key (EXTENT_CSUM EXTENT_CSUM 99654656000) level 0 tree block backref root 7 item 8 key (61514137600 EXTENT_ITEM 16384) itemoff 15824 itemsize 51 extent 
refs 1 gen 266026 flags TREE_BLOCK tree block key (EXTENT_CSUM EXTENT_CSUM 99620417536) level 0 tree block backref root 7 item 9 key (61514153984 EXTENT_ITEM 16384) itemoff 15773 itemsize 51 extent refs 1 gen 266026 flags TREE_BLOCK tree block key (EXTENT_CSUM EXTENT_CSUM 99669962752) level 0 tree block backref root 7 item 10 key (61514170368 EXTENT_ITEM 16384) itemoff 15722 itemsize 51 extent refs 1 gen 266026 flags TREE_BLOCK tree block key (EXTENT_CSUM EXTENT_CSUM 99639615488) level 0 tree block backref root 7 item 11 key (61514186752 EXTENT_ITEM 16384) itemoff 15671 itemsize 51 extent refs 1 gen 266026 flags TREE_BLOCK tree block key (EXTENT_CSUM EXTENT_CSUM 99681320960) level 0 tree block backref root 7 item 12 key (61514203136 EXTENT_ITEM 16384) itemoff 15620 itemsize 51 extent refs 1 gen 276285 flags TREE_BLOCK tree block key (882130 INODE_ITEM 0) level 0 tree block backref root 260 item 13 key (61514219520 EXTENT_ITEM 16384) itemoff 15569 itemsize 51 extent refs 1 gen 276685 flags TREE_BLOCK tree block key (61831168000 EXTENT_ITEM 16384) level 0 tree block backref root 2 item 14 key (61514268672 EXTENT_ITEM 16384) itemoff 15518 itemsize 51 extent refs 1 gen 275904 flags TREE_BLOCK tree block key (1553336 INODE_ITEM 0) level 0 tree block backref root 260 item 15 key (61514285056 EXTENT_ITEM 16384) itemoff 15467 itemsize 51 extent refs 1 gen 276685 flags TREE_BLOCK tree block key (62053400576 EXTENT_ITEM 16384) level 0 tree block backref root 2 item 16 key (61514334208 EXTENT_ITEM 16384) itemoff 15416 itemsize 51 extent refs 1 gen 266026 flags TREE_BLOCK tree block key (EXTENT_CSUM EXTENT_CSUM 99928444928) level 0 tree block backref root 7 item 17 key (61514350592 EXTENT_ITEM 16384) itemoff 15365 itemsize 51 extent refs 1 gen 266026 flags TREE_BLOCK tree block key (EXTENT_CSUM EXTENT_CSUM 99940794368) level 0 tree block backref root 7 item 18 key (61514366976 EXTENT_ITEM 16384) itemoff 15314 itemsize 51 extent refs 1
Re: Crash, boot mount failure: "corrupt leaf, slot offset bad"
Chris Bainbridge wrote on 2016/01/04 17:05 +: Kernel 4.4.0-rc7 System is Macbook with 109GB btrfs partition on SSD System crashed (nothing in syslog, could have been btrfs or possibly GPU fault shortly after running xrandr). After hard reset the btrfs partition was corrupt and would not mount. I took an image of the partition (some of the output below refers to sda5, some to loop0, it is the same image) # dmesg of mount failure (using kernel 4.2 from an Ubuntu 15.10 recovery drive but same on 4.4.0-rc7): [ 1969.425321] BTRFS critical (device sda5): corrupt leaf, slot offset bad: block=67239936,root=1, slot=82 [ 1969.428809] BTRFS critical (device sda5): corrupt leaf, slot offset bad: block=67239936,root=1, slot=82 [ 1969.431981] [ cut here ] [ 1969.432018] WARNING: CPU: 2 PID: 11162 at /build/linux-cRemOf/linux-4.2.0/fs/btrfs/extent-tree.c:6264 __btrfs_free_extent.isra.69+0x2ef/0xd70 [btrfs]() [ 1969.432028] BTRFS: Transaction aborted (error -5) [ 1969.432030] Modules linked in: drbg ansi_cprng ctr ccm intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp btrfs arc4 xor coretemp rt2800usb rt2x00usb raid6_pq b43 rt2800lib rt2x00lib kvm_intel mac80211 cfg80211 btusb btrtl btbcm btintel bluetooth uvcvideo kvm ssb crc_ccitt crct10dif_pclmul crc32_pclmul snd_hda_codec_hdmi snd_hda_codec_cirrus snd_hda_codec_generic videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev snd_hda_intel snd_hda_codec snd_hda_core aesni_intel aes_x86_64 lrw gf128mul glue_helper applesmc snd_hwdep snd_pcm snd_timer snd soundcore joydev input_polldev media ablk_helper cryptd bcm5974 input_leds lpc_ich bcma mei_me mei thunderbolt sbs sbshc acpi_als kfifo_buf apple_gmux industrialio mac_hid shpchp apple_bl autofs4 hid_generic i915 hid_apple sdhci_pci [ 1969.432115] i2c_algo_bit ahci uas drm_kms_helper libahci sdhci usb_storage usbhid drm video hid [ 1969.432132] CPU: 2 PID: 11162 Comm: mount Tainted: GW 4.2.0-22-generic #27-Ubuntu [ 1969.432136] Hardware name: Apple Inc. 
MacBookPro10,2/Mac-AFD8A9D944EA4843, BIOS MBP102.88Z.0106.B07.1501071215 01/07/2015 [ 1969.432139] 037e909d 88026267b5d8 817e94c9 [ 1969.432145] 88026267b630 88026267b618 8107b3d6 [ 1969.432152] 88026267b608 000e5297c000 fffb [ 1969.432157] Call Trace: [ 1969.432167] [] dump_stack+0x45/0x57 [ 1969.432175] [] warn_slowpath_common+0x86/0xc0 [ 1969.432181] [] warn_slowpath_fmt+0x55/0x70 [ 1969.432199] [] __btrfs_free_extent.isra.69+0x2ef/0xd70 [btrfs] [ 1969.432229] [] ? find_ref_head+0x5a/0x80 [btrfs] [ 1969.432248] [] __btrfs_run_delayed_refs+0x988/0x1080 [btrfs] [ 1969.432268] [] btrfs_run_delayed_refs.part.73+0x6e/0x270 [btrfs] [ 1969.432284] [] ? btrfs_set_path_blocking+0x43/0x80 [btrfs] [ 1969.432306] [] btrfs_run_delayed_refs+0x15/0x20 [btrfs] [ 1969.432326] [] btrfs_commit_transaction+0x56/0xb20 [btrfs] [ 1969.432332] [] ? kmem_cache_free+0x1cf/0x1e0 [ 1969.432356] [] btrfs_recover_log_trees+0x3ed/0x490 [btrfs] [ 1969.432378] [] ? replay_one_extent+0x6a0/0x6a0 [btrfs] [ 1969.432397] [] open_ctree+0x19b1/0x23e0 [btrfs] [ 1969.432412] [] btrfs_mount+0x94e/0xa70 [btrfs] [ 1969.432420] [] ? find_next_bit+0x15/0x20 [ 1969.432427] [] ? pcpu_alloc+0x385/0x670 [ 1969.432434] [] mount_fs+0x38/0x160 [ 1969.432439] [] ? __alloc_percpu+0x15/0x20 [ 1969.432446] [] vfs_kern_mount+0x6b/0x120 [ 1969.432463] [] btrfs_mount+0x1e8/0xa70 [btrfs] [ 1969.432469] [] ? pcpu_alloc+0x385/0x670 [ 1969.432475] [] mount_fs+0x38/0x160 [ 1969.432481] [] ? __alloc_percpu+0x15/0x20 [ 1969.432486] [] vfs_kern_mount+0x6b/0x120 [ 1969.432493] [] do_mount+0x246/0xd10 [ 1969.432498] [] ? strndup_user+0x4e/0xb0 [ 1969.432503] [] ? 
memdup_user+0x46/0x80 [ 1969.432510] [] SyS_mount+0x9f/0x100 [ 1969.432519] [] entry_SYSCALL_64_fastpath+0x16/0x75 [ 1969.432523] ---[ end trace 7a560cc73341e0d1 ]--- [ 1969.432528] BTRFS: error (device sda5) in __btrfs_free_extent:6264: errno=-5 IO failure [ 1969.436576] BTRFS: error (device sda5) in btrfs_run_delayed_refs:2781: errno=-5 IO failure [ 1969.442036] BTRFS: error (device sda5) in btrfs_replay_log:2375: errno=-5 IO failure (Failed to recover log tree) [ 1969.446087] BTRFS error (device sda5): cleaner transaction attach returned -30 [ 1969.486477] BTRFS: open_ctree failed # btrfsck: checking filesystem on sda5 UUID: b1103526-98a3-4b40-a782-cf66721ed600 checking extents incorrect offsets 11897 5713478 bad block 67239936 Errors found in extent allocation tree or chunk allocation checking free space cache There is no free space entry for 61515563008-62297997312 cache appears valid but isnt 61224255488 found 7839154285 bytes used err is -22 total csum bytes: 0 total tree bytes: 7454720 total fs tree bytes: 0 total extent tree bytes: 7356416 btree space waste bytes: 2454021 file data blocks allocated: 28508160 referenced 28508160
Crash, boot mount failure: "corrupt leaf, slot offset bad"
Kernel 4.4.0-rc7 System is Macbook with 109GB btrfs partition on SSD System crashed (nothing in syslog, could have been btrfs or possibly GPU fault shortly after running xrandr). After hard reset the btrfs partition was corrupt and would not mount. I took an image of the partition (some of the output below refers to sda5, some to loop0, it is the same image) # dmesg of mount failure (using kernel 4.2 from an Ubuntu 15.10 recovery drive but same on 4.4.0-rc7): [ 1969.425321] BTRFS critical (device sda5): corrupt leaf, slot offset bad: block=67239936,root=1, slot=82 [ 1969.428809] BTRFS critical (device sda5): corrupt leaf, slot offset bad: block=67239936,root=1, slot=82 [ 1969.431981] [ cut here ] [ 1969.432018] WARNING: CPU: 2 PID: 11162 at /build/linux-cRemOf/linux-4.2.0/fs/btrfs/extent-tree.c:6264 __btrfs_free_extent.isra.69+0x2ef/0xd70 [btrfs]() [ 1969.432028] BTRFS: Transaction aborted (error -5) [ 1969.432030] Modules linked in: drbg ansi_cprng ctr ccm intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp btrfs arc4 xor coretemp rt2800usb rt2x00usb raid6_pq b43 rt2800lib rt2x00lib kvm_intel mac80211 cfg80211 btusb btrtl btbcm btintel bluetooth uvcvideo kvm ssb crc_ccitt crct10dif_pclmul crc32_pclmul snd_hda_codec_hdmi snd_hda_codec_cirrus snd_hda_codec_generic videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev snd_hda_intel snd_hda_codec snd_hda_core aesni_intel aes_x86_64 lrw gf128mul glue_helper applesmc snd_hwdep snd_pcm snd_timer snd soundcore joydev input_polldev media ablk_helper cryptd bcm5974 input_leds lpc_ich bcma mei_me mei thunderbolt sbs sbshc acpi_als kfifo_buf apple_gmux industrialio mac_hid shpchp apple_bl autofs4 hid_generic i915 hid_apple sdhci_pci [ 1969.432115] i2c_algo_bit ahci uas drm_kms_helper libahci sdhci usb_storage usbhid drm video hid [ 1969.432132] CPU: 2 PID: 11162 Comm: mount Tainted: GW 4.2.0-22-generic #27-Ubuntu [ 1969.432136] Hardware name: Apple Inc. 
MacBookPro10,2/Mac-AFD8A9D944EA4843, BIOS MBP102.88Z.0106.B07.1501071215 01/07/2015 [ 1969.432139] 037e909d 88026267b5d8 817e94c9 [ 1969.432145] 88026267b630 88026267b618 8107b3d6 [ 1969.432152] 88026267b608 000e5297c000 fffb [ 1969.432157] Call Trace: [ 1969.432167] [] dump_stack+0x45/0x57 [ 1969.432175] [] warn_slowpath_common+0x86/0xc0 [ 1969.432181] [] warn_slowpath_fmt+0x55/0x70 [ 1969.432199] [] __btrfs_free_extent.isra.69+0x2ef/0xd70 [btrfs] [ 1969.432229] [] ? find_ref_head+0x5a/0x80 [btrfs] [ 1969.432248] [] __btrfs_run_delayed_refs+0x988/0x1080 [btrfs] [ 1969.432268] [] btrfs_run_delayed_refs.part.73+0x6e/0x270 [btrfs] [ 1969.432284] [] ? btrfs_set_path_blocking+0x43/0x80 [btrfs] [ 1969.432306] [] btrfs_run_delayed_refs+0x15/0x20 [btrfs] [ 1969.432326] [] btrfs_commit_transaction+0x56/0xb20 [btrfs] [ 1969.432332] [] ? kmem_cache_free+0x1cf/0x1e0 [ 1969.432356] [] btrfs_recover_log_trees+0x3ed/0x490 [btrfs] [ 1969.432378] [] ? replay_one_extent+0x6a0/0x6a0 [btrfs] [ 1969.432397] [] open_ctree+0x19b1/0x23e0 [btrfs] [ 1969.432412] [] btrfs_mount+0x94e/0xa70 [btrfs] [ 1969.432420] [] ? find_next_bit+0x15/0x20 [ 1969.432427] [] ? pcpu_alloc+0x385/0x670 [ 1969.432434] [] mount_fs+0x38/0x160 [ 1969.432439] [] ? __alloc_percpu+0x15/0x20 [ 1969.432446] [] vfs_kern_mount+0x6b/0x120 [ 1969.432463] [] btrfs_mount+0x1e8/0xa70 [btrfs] [ 1969.432469] [] ? pcpu_alloc+0x385/0x670 [ 1969.432475] [] mount_fs+0x38/0x160 [ 1969.432481] [] ? __alloc_percpu+0x15/0x20 [ 1969.432486] [] vfs_kern_mount+0x6b/0x120 [ 1969.432493] [] do_mount+0x246/0xd10 [ 1969.432498] [] ? strndup_user+0x4e/0xb0 [ 1969.432503] [] ? 
memdup_user+0x46/0x80 [ 1969.432510] [] SyS_mount+0x9f/0x100 [ 1969.432519] [] entry_SYSCALL_64_fastpath+0x16/0x75 [ 1969.432523] ---[ end trace 7a560cc73341e0d1 ]--- [ 1969.432528] BTRFS: error (device sda5) in __btrfs_free_extent:6264: errno=-5 IO failure [ 1969.436576] BTRFS: error (device sda5) in btrfs_run_delayed_refs:2781: errno=-5 IO failure [ 1969.442036] BTRFS: error (device sda5) in btrfs_replay_log:2375: errno=-5 IO failure (Failed to recover log tree) [ 1969.446087] BTRFS error (device sda5): cleaner transaction attach returned -30 [ 1969.486477] BTRFS: open_ctree failed # btrfsck: checking filesystem on sda5 UUID: b1103526-98a3-4b40-a782-cf66721ed600 checking extents incorrect offsets 11897 5713478 bad block 67239936 Errors found in extent allocation tree or chunk allocation checking free space cache There is no free space entry for 61515563008-62297997312 cache appears valid but isnt 61224255488 found 7839154285 bytes used err is -22 total csum bytes: 0 total tree bytes: 7454720 total fs tree bytes: 0 total extent tree bytes: 7356416 btree space waste bytes: 2454021 file data blocks allocated: 28508160 referenced 28508160 # btrfs-image (compiled v4.3.1 from git:.../kdave
Re: Can't mount btrfs: corrupt leaf, slot offset bad
On Tue, Oct 13, 2015 at 06:25:54PM -0500, EJ Parker wrote:
> I rebooted my server last night and discovered that my btrfs
> filesystem (3 disk raid1) would not mount anymore. After doing some
> research and getting nowhere I went to IRC and user darkling asked me
> a few questions and asked for output of btrfs-debug-tree and
> ultimately sent me here saying I should include a handful of things:
>
> Before I go further, let's get required info out of the way:
>
> uname -a:
> Linux archhost1 4.2.3-1-ARCH #1 SMP PREEMPT Sat Oct 3 18:52:50
> CEST 2015 x86_64 GNU/Linux
> btrfs --version:
> btrfs-progs v4.2.1
> output from "btrfs fi show":
> Label: none uuid: 5470630f-39f4-4d39-90a2-277d7991722a
> Total devices 3 FS bytes used 3.10TiB
> devid 1 size 3.64TiB used 2.12TiB path /dev/sdd
> devid 2 size 3.64TiB used 2.12TiB path /dev/sde
> devid 3 size 3.64TiB used 2.12TiB path /dev/sdc
>
> First, I am able to mount with -o ro,recovery, but not with just -o
> recovery. When I attempt to mount w/o ro, I get this in dmesg:

[snip]

> darkling also mentioned that btrfs-zero-log might help too, but that I
> should get confirmation from one of the devs on that.

I suggested the zero-log might work because the FS is mountable with -o ro, but not without, which suggests a corrupt log. However, it's not obvious to me what tree the corruption is in, and whether zeroing the log might actually hurt the recovery process.

Hugo.

--
Hugo Mills             | There's an infinite number of monkeys outside who
hugo@... carfax.org.uk | want to talk to us about this new script for Hamlet
http://carfax.org.uk/  | they've worked out!
PGP: E2AB1DE4          |                                        Arthur Dent
Can't mount btrfs: corrupt leaf, slot offset bad
I rebooted my server last night and discovered that my btrfs filesystem (3 disk raid1) would not mount anymore. After doing some research and getting nowhere I went to IRC and user darkling asked me a few questions and asked for output of btrfs-debug-tree and ultimately sent me here saying I should include a handful of things:

Before I go further, let's get required info out of the way:

uname -a:
Linux archhost1 4.2.3-1-ARCH #1 SMP PREEMPT Sat Oct 3 18:52:50 CEST 2015 x86_64 GNU/Linux

btrfs --version:
btrfs-progs v4.2.1

output from "btrfs fi show":
Label: none uuid: 5470630f-39f4-4d39-90a2-277d7991722a
Total devices 3 FS bytes used 3.10TiB
devid 1 size 3.64TiB used 2.12TiB path /dev/sdd
devid 2 size 3.64TiB used 2.12TiB path /dev/sde
devid 3 size 3.64TiB used 2.12TiB path /dev/sdc

First, I am able to mount with -o ro,recovery, but not with just -o recovery. When I attempt to mount w/o ro, I get this in dmesg:

[44478.800613] BTRFS critical (device sde): corrupt leaf, slot offset bad: block=5674754899968,root=1, slot=147
[44478.802489] BTRFS critical (device sde): corrupt leaf, slot offset bad: block=5674754899968,root=1, slot=147
[44478.804072] BTRFS error (device sde): Error removing orphan entry, stopping orphan cleanup
[44478.805856] BTRFS error (device sde): could not do orphan cleanup -22
[44482.635498] BTRFS: open_ctree failed

Running "btrfs-debug-tree -b 5674754899968 /dev/sde" gave me this:

leaf 5674754899968 items 207 free space 30 generation 884595 owner 5
fs uuid 5470630f-39f4-4d39-90a2-277d7991722a
chunk uuid c269615e-7397-41bc-95d0-dfdb2a696b23
[...]
item 145 key (273094 EXTENT_DATA 364924928) itemoff 8545 itemsize 53
    extent data disk byte 8658465382400 nr 4096
    extent data offset 0 nr 4096 ram 4096
    extent compression 0
item 146 key (273094 EXTENT_DATA 364929024) itemoff 8492 itemsize 53
    extent data disk byte 8658465378304 nr 4096
    extent data offset 0 nr 4096 ram 4096
    extent compression 0
item 147 key (273094 EXTENT_DATA 364933120) itemoff 8439 itemsize 53
    extent data disk byte 8677950173184 nr 24576
    extent data offset 0 nr 20480 ram 24576
    extent compression 0
item 148 key (273094 EXTENT_DATA 364953600) itemoff 8333 itemsize 53
    extent data disk byte 8677990363136 nr 20480
    extent data offset 0 nr 16384 ram 20480
    extent compression 0
item 149 key (273094 EXTENT_DATA 364957696) itemoff 8386 itemsize 53
    extent data disk byte 0 nr 0
    extent data offset 0 nr 18446744073709514752 ram 18446744073709514752
    extent compression 0
item 150 key (273094 EXTENT_DATA 364969984) itemoff 8280 itemsize 53
    extent data disk byte 8678063341568 nr 20480
    extent data offset 0 nr 16384 ram 20480
    extent compression 0
item 151 key (273094 EXTENT_DATA 365002752) itemoff 8227 itemsize 53
    extent data disk byte 8678025232384 nr 36864
    extent data offset 0 nr 32768 ram 36864
    extent compression 0
item 152 key (273094 EXTENT_DATA 365019136) itemoff 8174 itemsize 53
    extent data disk byte 8678112104448 nr 36864
    extent data offset 0 nr 32768 ram 36864
    extent compression 0
item 153 key (273094 EXTENT_DATA 365051904) itemoff 8121 itemsize 53
    extent data disk byte 8678052835328 nr 53248
    extent data offset 0 nr 49152 ram 53248
    extent compression 0
item 154 key (273094 EXTENT_DATA 365101056) itemoff 8068 itemsize 53
    extent data disk byte 8678090510336 nr 20480
    extent data offset 0 nr 16384 ram 20480
    extent compression 0
item 155 key (273094 EXTENT_DATA 365117440) itemoff 8015 itemsize 53
    extent data disk byte 8678117130240 nr 20480
    extent data offset 0 nr 16384 ram 20480
    extent compression 0
[...]
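For anyone trying to read the dump above, here is a rough Python sketch that re-checks the item offsets by hand. It assumes the consistency rule is that each item's data offset must equal the end (itemoff + itemsize) of the next item's data; that is my reading of the "slot offset bad" message and of the "incorrect offsets 8439 8386" pair that btrfs check prints, not something confirmed in this thread. The helper names bad_slots and as_signed64 are made up for illustration, and the numbers are copied from the dump.

```python
# (slot, itemoff, itemsize) for items 145..155, copied from the dump above.
ITEMS = [
    (145, 8545, 53),
    (146, 8492, 53),
    (147, 8439, 53),
    (148, 8333, 53),
    (149, 8386, 53),
    (150, 8280, 53),
    (151, 8227, 53),
    (152, 8174, 53),
    (153, 8121, 53),
    (154, 8068, 53),
    (155, 8015, 53),
]


def bad_slots(items):
    """Return (slot, itemoff, end_of_next_item) for every slot whose
    offset does not match the end of the following item's data.

    Assumed rule (not taken from the thread): offset(slot) == end(slot + 1).
    """
    bad = []
    for (slot, off, _size), (_next, next_off, next_size) in zip(items, items[1:]):
        next_end = next_off + next_size
        if off != next_end:
            bad.append((slot, off, next_end))
    return bad


def as_signed64(value):
    """Reinterpret an unsigned 64-bit number as a signed one."""
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    for slot, off, next_end in bad_slots(ITEMS):
        print(f"slot {slot}: itemoff {off} != end of next item {next_end}")
    # The odd "nr" value printed for item 149, read as a signed number.
    print(as_signed64(18446744073709514752))
```

Under that assumption the first mismatch is at slot 147 (8439 against 8386), the same slot and numbers the kernel and btrfs check complain about, and the offsets of items 148 and 149 look swapped. The huge "nr" value on item 149 is 2^64 - 36864, i.e. what a negative length looks like when printed as unsigned. What that means for repair is not something this sketch can answer.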
Output from "btrfs check --readonly /dev/sde":

Checking filesystem on /dev/sde
UUID: 5470630f-39f4-4d39-90a2-277d7991722a
checking extents
incorrect offsets 8439 8386
bad block 5674754899968
Errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots

Output from (failed) "btrfs check --repair /dev/sdc" (which I tried prior to seeking help):

enabling repair mode
Checking filesystem on /dev/sdc
UUID: 5470630f-39f4-4d39-90a2-277d7991722a
checking extents
incorrect offsets 8439 8386
shifting item nr 148 by bytes in block 5674754899968
items overlap, can't fix
cmds-check.c:4059: fix_item_offset: Assertion `ret` failed.

darklink also mentioned that btrfs-zero-log might help too, but that I shou
Re: Bug: corrupt leaf. slot offset bad: root subvolume unmountable, btrfs check crashes
Ah. Thank you for the replies. I didn't get them as mails and spinics didn't update the thread until yesterday.

So I take it that the recommended course of action is not to wait for any more or less unlikely btrfs-progs fix, but to try --repair and be ready to restore from backup, too. Darn, and that over what probably doesn't amount to more than a few dozen KB. Wish I could simply replace the single subvolume instead, but I suppose that's one of btrfs's drawbacks.

I did a full partition backup some three weeks ago, so I'll have to spend some hours to figure out what has changed since then, and how to do incremental backups of it to different devices for the next time… I don't have the time atm though; it'll probably take at least a week (unless the partition decides to die) to report back.

As a side note, there was an ostensibly similar issue fixed in 2012: https://bugzilla.novell.com/show_bug.cgi?id=760279 Guess that was a different underlying issue, though.

Duncan posted on Wed, 23 Apr 2014 02:55:36 +:
> Andreas Reis posted on Tue, 22 Apr 2014 20:16:13 +0200 as excerpted:
>> Same failure with btrfs-progs from integration-20140421 (apart from the line number 1156). Can I get a bit of input on this? Is it safe to just ignore the error for now (as I'm doing atm), ie. remount as rw to skip the orphan cleanup?
> I explained orphans in my other reply. Since they're simply not yet completed file deletions, it should be /relatively/ safe to continue ignoring and doing the manual remount rw, since that continues to work.

-- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug: corrupt leaf. slot offset bad: root subvolume unmountable, btrfs check crashes
Same failure with btrfs-progs from integration-20140421 (apart from the line number 1156). Can I get a bit of input on this? Is it safe to just ignore the error for now (as I'm doing atm), ie. remount as rw to skip the orphan cleanup?

Might it even be safe to call btrfs check --repair on the partition? I'm not keen on that failing mid-process at the same assertion and thus breaking it over a bunch of minor files, just like it happened with my previous btrfs partitions.

On 21.04.2014 21:13, Andreas Reis wrote:
> Alright, turns out the partition does actually mount on 3.15-rc2 (error messages remain, of course). But systemd will fail to continue booting as /bin/mount returns exit status 32 and / thus ends as ro, yet can be manually remounted as rw.
> Another error message I've spotted with 3.15 is
> BTRFS error (device sdc5): error loading props for ino 1810424 (root 257): -5
> I've now tried to mount with -o recovery and clear_cache, no effect.
Re: Bug: corrupt leaf. slot offset bad: root subvolume unmountable, btrfs check crashes
Andreas Reis posted on Tue, 22 Apr 2014 20:16:13 +0200 as excerpted:

> Same failure with btrfs-progs from integration-20140421 (apart from the line number 1156). Can I get a bit of input on this? Is it safe to just ignore the error for now (as I'm doing atm), ie. remount as rw to skip the orphan cleanup?

I explained orphans in my other reply. Since they're simply not yet completed file deletions, it should be /relatively/ safe to continue ignoring and doing the manual remount rw, since that continues to work.

Relatively as in that's what I'd do in the shorter term here were I seeing the problem, tho I'd ensure my backups were current and tested, as should be the case on btrfs anyway since it's not entirely stable yet, and just because I don't like nagging half-dealt-with-problems left laying around and the error would eat at me until I'd cleared it, at some point likely rather sooner than later, I'd very likely mkfs and restore from those backups. But I'd certainly be willing to continue running from the partition short term, for a week or so until I had a chance to do the mkfs.btrfs and restore from backup, as long as that remained the only issue I was seeing.

> Might it even be safe to call btrfs check --repair on the partition? I'm not keen on that failing mid-process at the same assertion and thus breaking it over a bunch of minor files, just like it happened with my previous btrfs partitions.

That I can't say. Based on reports and the common knowledge of the list, I've become rather leery of btrfs check --repair myself, and tend to rely on scrub and balance to fix issues if they can, and beyond that, mkfs.btrfs and restore from backup. In fact, while btrfs check without the --repair is safe as it's read-only, I don't run it regularly either, because I know should it report problems I'd then be worried about things I might have no reasonable way to fix, that obviously aren't causing me problems anyway.
Basically, if mounting and regular use of the filesystem isn't giving me anything unusual in dmesg, I consider it good, and for the most part I tend to route around btrfs check entirely, as if it weren't even there, tho I've run it in default read-only mode a few times, to compare my output with a post from the list or something, always with a clean bill of health from btrfs check when I have run it.

That said, if you have backups tested and ready anyway, and would otherwise be doing a mkfs.btrfs in short order in order to get rid of those bad orphan warnings anyway, I don't see the harm in running it, since at that point it's zero risk anyway. If you lose the filesystem as a result, big deal, as you were going to mkfs.btrfs and restore from backup anyway, and if it fixes the problem, well, you saved yourself the hassle. Plus, either way you can report back the results and then we'll know whether it's safe to recommend btrfs check for the next report, or not. =:^)

-- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman
Bug: corrupt leaf. slot offset bad: root subvolume unmountable, btrfs check crashes
Kernel 3.15.0-rc2, btrfs-progs 3.14.1

While doing some minor package updates my btrfs root partition [*] decided to corrupt itself. There was no system crash, although I had plenty of these (due to a USB-related regression) in recent weeks that resulted in no trouble.

First only one of a package's folders was corrupted, any access to files within (incl. attempts to delete) printed

btrfs: corrupt leaf, slot offset bad: block=842924032,root=1, slot=88

to dmesg (I'm actually not sure about the numbers, but that was indeed the error message). After moving the folder out of the way the partition continued to appear working as normal, one reboot also worked fine.

Now I can't boot at all (beyond loading the kernel image located on another partition), with either 3.15-rc2 or 3.14.1. Attempting to mount the __current/ROOT subvolume on ArchLinux's current Live-CD (kernel 3.13.7) prints

btrfs: device label Linux devid 1 transid 55586 /dev/sdc5
btrfs: use ssd allocation scheme
btrfs: disk space caching is enabled
btrfs: checking UUID tree
btrfs: corrupt leaf, slot offset bad: block=842924032,root=1, slot=88
btrfs: corrupt leaf, slot offset bad: block=842924032,root=1, slot=88
BTRFS error (device sdc5): Error removing orphan entry, stopping orphan cleanup
BTRFS critical (device sdc5): could not do orphan cleanup -22

Doing btrfs check /dev/sdc5 merely first prints ten

free space inode generation (0) did not match free space cache generation ([different transids between 40010 and 55578])

to then abort with

checking fs roots
btrfs: cmds-check.c:1151: process_file_extent: Assertion `!(rec->ino != key->objectid || rec->refs > 1)' failed.

I'm reluctant to try any of the btrfs check options (or mount with -o recovery) since the last three times I did this (with other partitions) it resulted in the partition becoming entirely trashed, while before at least btrfs restore still managed to extract some data each time.
The affected folder was one within /usr/include/qt4 (which I then moved to /usr/BROKEN, to successfully reinstall the package), ie. on the __current/ROOT subvolume. Which seems the only subvolume affected (yet). Mounting and accessing the other three (__current/{var,home,opt}) still works.

[*] Organised following http://blog.fabio.mancinelli.me/2012/12/28/Arch_Linux_on_BTRFS.html

(Also posted on https://bugzilla.kernel.org/show_bug.cgi?id=74611 )
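Since the interesting numbers are buried in the dmesg lines, here is a small, hedged Python sketch for pulling block, root and slot out of a "corrupt leaf, slot offset bad" message. The regular expressions are written against the two message styles quoted in this archive only (the older "btrfs:" prefix without a device, and the newer "BTRFS critical (device ...)" form); other kernels may word the line differently. parse_corrupt_leaf is a made-up name.

```python
import re

# Matches the part of the message both quoted styles have in common.
CORRUPT_LEAF = re.compile(
    r"corrupt leaf, slot offset bad: "
    r"block=(?P<block>\d+),\s*root=(?P<root>\d+),\s*slot=(?P<slot>\d+)"
)
# The "(device sdX)" part only appears in the newer message style.
DEVICE = re.compile(r"\(device (?P<dev>[^)]+)\)")


def parse_corrupt_leaf(line):
    """Return a dict with device, block, root and slot, or None if the
    line is not a 'slot offset bad' message."""
    match = CORRUPT_LEAF.search(line)
    if match is None:
        return None
    device = DEVICE.search(line)
    return {
        "device": device.group("dev") if device else None,
        "block": int(match.group("block")),
        "root": int(match.group("root")),
        "slot": int(match.group("slot")),
    }


if __name__ == "__main__":
    sample = "btrfs: corrupt leaf, slot offset bad: block=842924032,root=1, slot=88"
    print(parse_corrupt_leaf(sample))
```

The block number is the value the other report in this archive passed to btrfs-debug-tree -b to dump the affected leaf.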
Re: Bug: corrupt leaf. slot offset bad: root subvolume unmountable, btrfs check crashes
Alright, turns out the partition does actually mount on 3.15-rc2 (error messages remain, of course). But systemd will fail to continue booting as /bin/mount returns exit status 32 and / thus ends as ro, yet can be manually remounted as rw.

Another error message I've spotted with 3.15 is

BTRFS error (device sdc5): error loading props for ino 1810424 (root 257): -5

I've now tried to mount with -o recovery and clear_cache, no effect.

On 21.04.2014 18:16, Andreas Reis wrote:
[snip]
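On the exit status 32 mentioned above: mount(8) documents its exit status as a set of bits that can be ORed together, so a small decoder can be sketched as follows. The table is transcribed from the util-linux man page as I remember it, so check it against the man page on your system; decode_mount_status is a made-up helper name.

```python
# Exit status bits as listed in the EXIT STATUS section of mount(8).
MOUNT_EXIT_BITS = {
    1: "incorrect invocation or permissions",
    2: "system error (out of memory, cannot fork, no more loop devices)",
    4: "internal mount bug",
    8: "user interrupt",
    16: "problems writing or locking /etc/mtab",
    32: "mount failure",
    64: "some mount succeeded",
}


def decode_mount_status(status):
    """Translate a mount(8) exit status into a list of descriptions."""
    if status == 0:
        return ["success"]
    return [text for bit, text in MOUNT_EXIT_BITS.items() if status & bit]


if __name__ == "__main__":
    print(decode_mount_status(32))
```

So a bare 32 says only that the mount (here presumably the rw remount systemd attempts) failed; the reason has to come from dmesg.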
Re: Bug: corrupt leaf. slot offset bad: root subvolume unmountable, btrfs check crashes
Andreas Reis posted on Mon, 21 Apr 2014 21:13:16 +0200 as excerpted:

> Alright, turns out the partition does actually mount on 3.15-rc2 (error messages remain, of course). But systemd will fail to continue booting as /bin/mount returns exit status 32 and / thus ends as ro, yet can be manually remounted as rw.

The mount manpage says status 32 is mount failure. Dmesg should contain more, but that's probably the errors you already mentioned. So you're getting the read-only mount, but can't remount rw.

(This doesn't apply in your case, but FWIW, I now have my root filesystem set up to be ro mounted by default, and have been running that way for some months, now. Seems safer that way. The only time I remount / rw is when I'm updating the system or changing something in the config, then I normally remount ro again, altho after updating the system I normally have to exit and restart X and kde as well as various system services before I can remount ro, depending on what libraries got changed out from under my running processes. Of course in order to make this work a few /var/ subdirs that need to be writable are actually symlinks to /home/var/ subdirs, /var/log is a dedicated writable logging partition of its own, etc. So a read-only rootfs is the /normal/ case for me, and wouldn't interfere with normal operations at all. =:^)

> Another error message I've spotted with 3.15 is
> BTRFS error (device sdc5): error loading props for ino 1810424 (root 257): -5

That would be one of the new btrfs properties introduced in kernel 3.14. See btrfs property list/get/set... Unless you've set individual file properties (such as compress), that's probably a property (such as ro/rw) on a subvolume, or possibly on the main filesystem (label, etc).

Meanwhile, orphans normally refer to files that are deleted while they're still in use. Normally, these will be libraries, etc, replaced during a system upgrade, but still in use by running programs.
Once all such running programs have been restarted (loading the new version of the library) or terminated, the filesystem can be unmounted or remounted read-only. In the event they're not fully cleaned up at umount time, they are normally cleaned up after reboot, when a filesystem is first mounted writable once again.

Obviously there's a problem with one of these orphans, and attempts to clean it up are failing, causing the remount rw to fail. While that doesn't help with fixing the problem, it should at least give you some idea of what's going on, and how to interpret the messages and errors you see.

-- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman
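To make the "deleted while still in use" idea concrete, here is a minimal Python sketch of the general POSIX behaviour: a file that is unlinked while a descriptor is still open loses its name but stays readable until the last descriptor is closed. It says nothing about how btrfs records such inodes on disk; it only illustrates the situation described above. unlinked_but_open is a made-up name, and the sketch assumes a POSIX system (it will not behave this way on Windows).

```python
import os
import tempfile


def unlinked_but_open(payload=b"still readable"):
    """Create a temp file, unlink it while it is open, and report what
    is still visible through the open descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                    # the name goes away ...
        exists = os.path.exists(path)
        nlink = os.fstat(fd).st_nlink      # link count seen through the fd
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))   # ... but the data is still there
    finally:
        os.close(fd)                       # only now can the space be freed
    return exists, nlink, data


if __name__ == "__main__":
    print(unlinked_but_open())
```

If the machine goes down before that last close, the filesystem has to finish the deletion at the next writable mount, which is the orphan cleanup step that is failing in this report.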