Re: Poll: time to switch skinny-metadata on by default?
On Mon, 20 Oct 2014 18:34:03 +0200, David Sterba wrote:
> On Thu, Oct 16, 2014 at 01:33:37PM +0200, David Sterba wrote:
> > I'd like to make it default with the 3.17 release of btrfs-progs.
> > Please let me know if you have objections.
>
> For the record, 3.17 will not change the defaults. The timing of the
> poll was very bad to get enough feedback before the release. Let's
> keep it open for now.

Two points:

First of all: does grub2 support booting from a btrfs file system with skinny-metadata, or is that irrelevant?

And secondly, I've gotten a BUG after trying to convert my external backup partition to skinny-metadata (the same partition as in the bug report mentioned previously in this thread, I believe). A more detailed account follows.

First of all, my setup (as of *now*, not before the BUG):

# btrfs filesystem show
Label: none  uuid: 0267d8b3-a074-460a-832d-5d5fd36bae64
        Total devices 1 FS bytes used 41.42GiB
        devid 1 size 107.79GiB used 53.06GiB path /dev/sdf1

Label: 'MARCEC_STORAGE'  uuid: 472c9290-3ff2-4096-9c47-0612d3a52cef
        Total devices 4 FS bytes used 514.54GiB
        devid 1 size 298.09GiB used 259.03GiB path /dev/sda
        devid 2 size 298.09GiB used 259.03GiB path /dev/sdb
        devid 3 size 298.09GiB used 259.03GiB path /dev/sdc
        devid 4 size 298.09GiB used 259.03GiB path /dev/sdd

Label: 'MARCEC_BACKUP'  uuid: f97b3cda-15e8-418b-bb9b-235391ef2a38
        Total devices 1 FS bytes used 169.31GiB
        devid 1 size 976.56GiB used 175.06GiB path /dev/sdg2

Btrfs v3.17

# btrfs filesystem df /
Data, single: total=48.00GiB, used=39.94GiB
System, DUP: total=32.00MiB, used=12.00KiB
Metadata, DUP: total=2.50GiB, used=1.48GiB
GlobalReserve, single: total=508.00MiB, used=0.00B

# btrfs filesystem df /home
Data, RAID10: total=516.00GiB, used=513.38GiB
System, RAID10: total=64.00MiB, used=96.00KiB
Metadata, RAID10: total=2.00GiB, used=1.16GiB
GlobalReserve, single: total=400.00MiB, used=0.00B

# btrfs filesystem df /media/MARCEC_BACKUP
Data, single: total=167.00GiB, used=166.53GiB
System, DUP: total=32.00MiB, used=28.00KiB
Metadata, DUP: total=4.00GiB, used=2.79GiB
GlobalReserve, single: total=512.00MiB, used=1.33MiB

# uname -a
Linux marcec 3.16.6-gentoo #1 SMP PREEMPT Fri Oct 24 01:06:49 CEST 2014 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux

# btrfs --version
Btrfs v3.17

Now, what I was trying to do - motivated by this thread - was to convert /home and /media/MARCEC_BACKUP to skinny-metadata using "btrfstune -x". That in itself worked fine, and MARCEC_BACKUP has since seen filesystem activity (running rsync, creating and deleting snapshots).

*Then* I started a "btrfs balance -m" on /home (which completed without errors) and then on /media/MARCEC_BACKUP, which is when the BUG happened (dmesg output below). The result in user space was that "btrfs balance" segfaulted. "btrfs balance status" showed the balance still running, so I tried to cancel it, which ended up hanging (the btrfs program has yet to return to the shell). For some reason I then ran "sync" (as root), which hung in the same way.

I can still access files on MARCEC_BACKUP just fine, and the snapshots are still there ("btrfs subvolume list" succeeds).

Is there anything else I can do, or any other information you might need?
dmesg output (starting with the start of the balance):

[ 4651.448883] BTRFS info (device sdb): relocating block group 1492765376512 flags 66
[ 4652.259501] BTRFS info (device sdb): found 2 extents
[ 4652.987753] BTRFS info (device sdb): relocating block group 1491691634688 flags 68
[ 4688.655390] BTRFS info (device sdb): found 13744 extents
[ 4689.382109] BTRFS info (device sdb): relocating block group 1485249183744 flags 68
[ 4753.879520] BTRFS info (device sdb): found 62519 extents
[ 4791.123268] BTRFS info (device sdg2): relocating block group 2499670966272 flags 36
[ 4830.811665] BTRFS info (device sdg2): found 1793 extents
[ 4831.240909] BTRFS info (device sdg2): relocating block group 2499134095360 flags 36
[ 5407.582370] BTRFS info (device sdg2): found 51182 extents
[ 5407.959115] BTRFS info (device sdg2): relocating block group 2498597224448 flags 36
[ 5724.487824] BTRFS info (device sdg2): found 51435 extents
[ 5725.006401] BTRFS info (device sdg2): relocating block group 2473867608064 flags 34
[ 5725.817513] BTRFS info (device sdg2): found 7 extents
[ 5726.328413] BTRFS info (device sdg2): relocating block group 2469002215424 flags 36
[ 5844.148295] ------------[ cut here ]------------
[ 5844.148307] WARNING: CPU: 1 PID: 7270 at fs/btrfs/extent-tree.c:876 btrfs_lookup_extent_info+0x48c/0x4c0()
[ 5844.148308] Modules linked in: uas usb_storage joydev hid_logitech_dj bridge stp llc ipt_REJECT xt_tcpudp iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_n
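The conversion path described above (flip the flag with btrfstune, then rewrite the existing metadata with a balance) can be sketched as a script. This is a minimal sketch, not Marc's exact commands: the device and mount point are placeholders, and with DRY_RUN left at its default the commands are only printed, since btrfstune must run against an unmounted filesystem and setting the flag is one-way.

```shell
#!/bin/sh
# Sketch of the skinny-metadata conversion steps. DEV and MNT are
# placeholders; with DRY_RUN=1 (the default) commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}
DEV=${DEV:-/dev/sdX}   # placeholder device
MNT=${MNT:-/mnt}       # placeholder mount point

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

run umount "$MNT"                  # btrfstune works on unmounted filesystems
run btrfstune -x "$DEV"            # set the skinny-metadata incompat flag
run mount "$DEV" "$MNT"
run btrfs balance start -m "$MNT"  # rewrite existing metadata as skinny extents
```

Note that btrfstune -x only marks the filesystem; it is the metadata balance afterwards that actually rewrites old extent items in the new format, which is why the BUG above was hit during the balance rather than during the conversion itself.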
Fw: Heavy nocow'd VM image fragmentation
Begin forwarded message (forgot to send to list):

Date: Sat, 25 Oct 2014 03:57:41 -0700
From: Duncan <1i5t5.dun...@cox.net>
To: Marc MERLIN
Subject: Re: Heavy nocow'd VM image fragmentation

On Fri, 24 Oct 2014 21:48:56 -0700 Marc MERLIN wrote:
> On Oct 25, 2014 11:28 AM, "Duncan" <1i5t5.dun...@cox.net> wrote:
> >
> > Yes, but the OP said he hadn't snapshotted since creating the file,
> > and MM's a regular that actually wrote much of the wiki
> > documentation on raid56 modes, so he'd better know about the
> > snapshotting problem too.
>
> Yes and no. I use btrfs send/receive, so I have to use snapshots on
> the subvolume my VM file is on.

That kind of screws things up, since you can delete the snapshots afterward, but if anything changed while a snapshot existed, it still forces a one-time CoW on it. As long as the send doesn't take "forever", the time in question can be reasonably short, but if the VM must remain active over that period, there's likely to still be /some/ effect.

The only thing you can do about that, I guess, is periodically defrag the images, but of course without snapshot-aware defrag that breaks any snapshot sharing, multiplying the space required. And with send/receive requiring a reference snapshot for incrementals, there's always a snapshot around, so space-doubling on anything actually defragged is unfortunately a given. =:^(

> Can you reply to this post to show my reply to others, since my
> Android post to the list will get rejected?

Dug out of the trash here too due to the HTML, but OK...

--
Duncan - HTML messages treated as spam
"They that can give up essential liberty to obtain a little temporary
safety, deserve neither liberty nor safety." -- Benjamin Franklin

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poll: time to switch skinny-metadata on by default?
On Sat, 25 Oct 2014 14:24:58 +0200, Marc Joliet wrote:
> I can still access files on MARCEC_BACKUP just fine, and the snapshots are
> still there ("btrfs subvolume list" succeeds).

Just an update: that was true for a while, but at some point listing directories and accessing the file system in general stopped working (all processes that touched the FS hung/zombified). This necessitated a hard reboot, since "reboot" and "halt" (so... "shutdown", really) didn't do anything other than spit out the usual "the system is rebooting" message.

Interestingly enough, the file system was (apparently) fine after that (just as Petr Janecek wrote), other than an invalid space cache file:

[   65.477006] BTRFS info (device sdg2): The free space cache file (2466854731776) is invalid. skip it

That is, running my backup routine worked just as before, and I can access files on the FS just fine. Oh, and apparently the rebalance continued successfully?!

[  342.540865] BTRFS info (device sdg2): continuing balance
[  342.51] BTRFS info (device sdg2): relocating block group 2502355320832 flags 34
[  342.821608] BTRFS info (device sdg2): found 4 extents
[  343.056915] BTRFS info (device sdg2): relocating block group 2501818449920 flags 36
[  437.932405] BTRFS info (device sdg2): found 25086 extents
[  438.727197] BTRFS info (device sdg2): relocating block group 2501281579008 flags 36
[  557.319354] BTRFS info (device sdg2): found 83875 extents

# btrfs balance status /media/MARCEC_BACKUP
No balance found on '/media/MARCEC_BACKUP'

No SEGFAULT anywhere. All I can say right now is "huh". Although I'll try starting a "balance -m" again tomorrow, because the continued balance only took about 3-4 minutes (maybe it …).

HTH
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
Re: Poll: time to switch skinny-metadata on by default?
On Oct 25, 2014, at 6:24 AM, Marc Joliet wrote:
>
> First of all: does grub2 support booting from a btrfs file system with
> skinny-metadata, or is it irrelevant?

It seems plausible that if older kernels don't understand skinny-metadata, GRUB2 won't either. So I just tested it with grub2-2.02-0.8.fc21, and it works. I'm surprised, actually.

The way I did this was to create a whole new fs with -O skinny-metadata and use btrfs send/receive to copy an existing system over. The kernel reports at boot time that the volume uses skinny extents.

Chris Murphy
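The test procedure Chris Murphy describes could look roughly like the following. The device and snapshot paths are made-up placeholders, and since the commands are destructive they are collected into a string and printed rather than executed:

```shell
#!/bin/sh
# Sketch only: DEV and SNAP below are hypothetical placeholders.
DEV=/dev/sdX2              # target partition for the new filesystem
SNAP=/mnt/old/root_snap    # read-only snapshot of the existing system

# Build a fresh skinny-metadata filesystem, then copy the system over
# with send/receive. Printed, not run.
cmds="mkfs.btrfs -O skinny-metadata $DEV
mount $DEV /mnt/new
btrfs send $SNAP | btrfs receive /mnt/new"

printf '%s\n' "$cmds"
```

Note that btrfs send requires a read-only snapshot as its source, which is why SNAP stands in for one here rather than the live root subvolume.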
Re: Poll: time to switch skinny-metadata on by default?
On Oct 25, 2014, at 2:33 PM, Chris Murphy wrote:
>
> On Oct 25, 2014, at 6:24 AM, Marc Joliet wrote:
>>
>> First of all: does grub2 support booting from a btrfs file system with
>> skinny-metadata, or is it irrelevant?
>
> Seems plausible if older kernels don't understand skinny-metadata, that GRUB2
> won't either. So I just tested it with grub2-2.02-0.8.fc21 and it works. I'm
> surprised, actually.

I don't understand the nature of the incompatibility with older kernels. Can they not mount a Btrfs volume even as ro? If so, then I'd expect GRUB to have a problem too, so I'm going to guess that maybe a 3.9 or older kernel can ro mount a Btrfs volume with skinny extents and the incompatibility is in writing.

Chris Murphy
Re: btrfs balance segfault, kernel BUG at fs/btrfs/extent-tree.c:7727
On Mon, Oct 13, 2014 at 11:12 AM, Rich Freeman wrote:
> On Thu, Oct 9, 2014 at 10:19 AM, Petr Janecek wrote:
>>
>> I have trouble finishing a btrfs balance on a five-disk raid10 fs.
>> I added a disk to a 4x3TB raid10 fs and ran "btrfs balance start
>> /mnt/b3", which segfaulted after a few hours, probably because of the BUG
>> below. "btrfs check" does not find any errors, both before the balance
>> and after reboot (the fs becomes impossible to umount).
>>
>> [22744.238559] WARNING: CPU: 0 PID: 4211 at fs/btrfs/extent-tree.c:876
>> btrfs_lookup_extent_info+0x292/0x30a [btrfs]()
>>
>> [22744.532378] kernel BUG at fs/btrfs/extent-tree.c:7727!
>
> I am running into something similar. I just added a 3TB drive to my
> raid1 btrfs and started a balance. The balance segfaulted, and I find
> this in dmesg:

I got another one of these crashes during a balance today, and this is on 3.17.1 with the "Btrfs: race free update of commit root for ro snapshots" patch. So there is something else in 3.17.1 that causes this problem. I did see mention of an extent error of some kind on the lists, and I don't have that patch - I believe it is planned for 3.17.2.

After the crash the filesystem became read-only. I didn't have any way to easily capture the logs, but I got repeated crashes when trying to re-mount the filesystem after rebooting. The dmesg log showed read errors from one of the devices (bdev /dev/sdb2 errs: wr 0, rd 1361, flush 0, corrupt 0, gen 0). When I tried to "btrfs check" the filesystem with btrfs-progs 3.17, it abruptly terminated and output an error mentioning "could not find extent items", followed by "root" and a really large number.

I finally managed to recover by mounting the device with skip_balance - I suspect that it was crashing due to attempts to restart the failing balance. Then, after letting the filesystem settle down, I unmounted it cleanly, rebooted, and everything was back to normal.

However, I'm still getting "bdev /dev/sdb2 errs: wr 0, rd 1361, flush 0, corrupt 0, gen 0" in my dmesg logs. I have tried scrubbing the device with no errors found.

--
Rich
Re: btrfs balance segfault, kernel BUG at fs/btrfs/extent-tree.c:7727
On Sat, 25 Oct 2014 09:41:27 PM, Rich Freeman wrote:
> So, there is something else in 3.17.1 that causes
> this problem. I did see mention of an extent error of some kind on
> the lists and I don't have that patch - I believe it is planned for
> 3.17.2.

There are currently 13 patches for btrfs queued for 3.17.2:

queue-3.17/btrfs-add-missing-compression-property-remove-in-btrfs_ioctl_setflags.patch
queue-3.17/btrfs-cleanup-error-handling-in-build_backref_tree.patch
queue-3.17/btrfs-don-t-do-async-reclaim-during-log-replay.patch
queue-3.17/btrfs-don-t-go-readonly-on-existing-qgroup-items.patch
queue-3.17/btrfs-fix-a-deadlock-in-btrfs_dev_replace_finishing.patch
queue-3.17/btrfs-fix-and-enhance-merge_extent_mapping-to-insert-best-fitted-extent-map.patch
queue-3.17/btrfs-fix-build_backref_tree-issue-with-multiple-shared-blocks.patch
queue-3.17/btrfs-fix-race-in-wait_sync-ioctl.patch
queue-3.17/btrfs-fix-the-wrong-condition-judgment-about-subset-extent-map.patch
queue-3.17/btrfs-fix-up-bounds-checking-in-lseek.patch
queue-3.17/btrfs-try-not-to-enospc-on-log-replay.patch
queue-3.17/btrfs-wake-up-transaction-thread-from-sync_fs-ioctl.patch
queue-3.17/revert-btrfs-race-free-update-of-commit-root-for-ro-snapshots.patch

You can grab them here:

http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/queue-3.17

Hope this helps!

Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
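Fetching and applying the queued patches to a 3.17.1 tree could be scripted along these lines. This is a sketch with assumptions: SRC is a placeholder kernel source path, the cgit "plain" URL form for raw files is assumed, only two of the 13 patches are listed for brevity, and download or patch failures are tolerated with a message rather than aborting:

```shell
#!/bin/sh
# Sketch: fetch queued btrfs patches from the stable-queue tree and apply
# them with patch(1). SRC is a placeholder; errors are reported, not fatal.
BASE=http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/plain/queue-3.17
SRC=${SRC:-$HOME/src/linux-3.17.1}

for p in \
    btrfs-don-t-do-async-reclaim-during-log-replay.patch \
    revert-btrfs-race-free-update-of-commit-root-for-ro-snapshots.patch
do
    if wget -q "$BASE/$p" -O "/tmp/$p" 2>/dev/null; then
        patch -d "$SRC" -p1 < "/tmp/$p" || echo "patch failed for $p"
    else
        echo "download failed, skipping $p"
    fi
done
```

Alternatively, cloning the whole stable-queue repository and applying everything under queue-3.17/ avoids fetching the patches one by one.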
Re: btrfs balance segfault, kernel BUG at fs/btrfs/extent-tree.c:7727
Rich Freeman posted on Sat, 25 Oct 2014 21:41:27 -0400 as excerpted:

> However, I'm still getting "bdev /dev/sdb2 errs: wr 0, rd 1361, flush 0,
> corrupt 0, gen 0" in my dmesg logs. I have tried scrubbing the device
> with no errors found.

Note that the error counts do /not/ reset at boot. The counts therefore accumulate since either the last mkfs or the last time the error counts were reset manually, and if you know you've had errors (as you did here), all you need to do is take note of the count and ensure it's not increasing unexpectedly.

Meanwhile, "btrfs device stats" can be used to print the error counts on demand, and its -z option resets them after that print, thus being the manual reset I mentioned above.

So chances are those read errors are the same ones you had previously. As long as the number isn't increasing, you're not registering any further errors.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
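Duncan's advice amounts to comparing counter snapshots over time (or clearing them with "btrfs device stats -z" and watching for new ones). As a minimal sketch, the read-error count can be pulled out of a dmesg line like the one Rich quoted with sed; the sample line is hard-coded here for illustration:

```shell
#!/bin/sh
# Extract the read-error counter from a btrfs "bdev ... errs:" dmesg line,
# so successive snapshots can be compared to see whether it is growing.
line='bdev /dev/sdb2 errs: wr 0, rd 1361, flush 0, corrupt 0, gen 0'
rd=$(printf '%s\n' "$line" | sed -n 's/.*rd \([0-9]*\),.*/\1/p')
echo "read errors: $rd"
```

In practice you would feed it the live line, e.g. "dmesg | grep 'bdev /dev/sdb2 errs:' | tail -n1", and alert only when the extracted number exceeds the previously recorded one.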