Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 2:32 PM, Timothy Normand Miller theo...@gmail.com wrote:

If I lose the array, I won't cry. The backup appears to be complete. But it would be convenient to avoid having to restore from scratch, and I'm hoping this might help you guys too in some way. I really like btrfs, and I would like to provide you with whatever info might contribute something.

Well, it seems fine if it mounts rw,degraded. Just do a 'btrfs replace start...' with a new drive. Or, if you're going to try the old drive that's failing, good luck with that. You might want to at least zero out all the superblocks. Check the wiki for their locations; wipefs only removes the signature of the first superblock. I don't know if that's enough for the purposes of a btrfs replace start (probably is, but I haven't tested it).

But then you need to fix this nodatacow thing by not using it as a mount option, and instead setting it as a subvolume or directory attribute with chattr +C. That way everything else is checksummed. Then you can use btrfs check --init-csum-tree to compute checksums for everything that right now has none due to nodatacow.

-- Chris Murphy
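For anyone following along later, the change Chris describes might look roughly like this on Timothy's layout (mount points and UUID taken from the fstab quoted elsewhere in this thread; this is an untested sketch, and chattr +C only takes effect for files created after the attribute is set):

# /etc/fstab: drop nodatacow from the vms entry so data checksums come back, e.g.
#   UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c  /mnt/vms  btrfs  noatime,space_cache,subvol=vms  0 2

# Mark the VM subvolume's directory NOCOW instead; only files created
# afterwards inherit the attribute, so existing images have to be copied
# (not reflinked) back into place to pick it up.
chattr +C /mnt/vms
lsattr -d /mnt/vms    # should now show the 'C' flag

The matching btrfs check --init-csum-tree step is sketched further down, after Duncan's note on scrub and non-CoW files.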
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 3:00 PM, Timothy Normand Miller theo...@gmail.com wrote: On Tue, Aug 11, 2015 at 4:48 PM, Chris Murphy li...@colorremedies.com wrote: The compress is ignored, and it looks like nodatasum and nodatacow apply to everything. The nodatasum means no raid1 self-healing is possible for any data on the entire volume. Metadata checksumming is still enabled. Ugh. So I need to change my fstab file. I swear, some expert on IRC told me that this should work fine, which is why I did it. In fact, I think they recommended it on the basis that I wanted to put VM images on one of the subvolumes. This discussion occurred a long time ago, well before RAID5 was even partially implemented. There is still data redundancy. Will a scrub at least notice that the copies differ? No, that's what I mean by nodatasum means no raid1 self-healing is possible. You have data redundancy, but without checksums btrfs has no way to know if they differ. It doesn't do two reads and compares them, it's just like md raid, it picks one device, and so long as there's no read error from the device, that copy of the data is assumed to be good. -- Chris Murphy -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 5:24 PM, Chris Murphy li...@colorremedies.com wrote: There is still data redundancy. Will a scrub at least notice that the copies differ? No, that's what I mean by nodatasum means no raid1 self-healing is possible. You have data redundancy, but without checksums btrfs has no way to know if they differ. It doesn't do two reads and compares them, it's just like md raid, it picks one device, and so long as there's no read error from the device, that copy of the data is assumed to be good. Ok, that makes sense. I'm guessing it wouldn't be worth it to add a feature like this because (a) few people use nodatacow or end up in my situation, and (b) if they did, and the two copies were inconsistent, what would you do? I suppose for me, it would be nice to know which files were affected. -- Timothy Normand Miller, PhD Assistant Professor of Computer Science, Binghamton University http://www.cs.binghamton.edu/~millerti/ Open Graphics Project -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 4:48 PM, Chris Murphy li...@colorremedies.com wrote: The compress is ignored, and it looks like nodatasum and nodatacow apply to everything. The nodatasum means no raid1 self-healing is possible for any data on the entire volume. Metadata checksumming is still enabled. Ugh. So I need to change my fstab file. I swear, some expert on IRC told me that this should work fine, which is why I did it. In fact, I think they recommended it on the basis that I wanted to put VM images on one of the subvolumes. This discussion occurred a long time ago, well before RAID5 was even partially implemented. There is still data redundancy. Will a scrub at least notice that the copies differ? -- Timothy Normand Miller, PhD Assistant Professor of Computer Science, Binghamton University http://www.cs.binghamton.edu/~millerti/ Open Graphics Project -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Missing dedupe/locking patch in integration-4.2 tree?
On Tue, Aug 11, 2015 at 09:42:10PM +0200, Holger Hoffstätte wrote: I saw this morning that it went into integration-4.3: https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=integration-4.3id=293a8489f300536dc6d996c35a6ebb89aa03bab2 So probably just an oversight. Ok thanks for pointing that out Holger! --Mark -- Mark Fasheh -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
btrfs-progs: btrfs balance returns enospc error on a system with 80% free space
I have recently installed an Arch Linux x86_64 system on a 50GB btrfs partition, and every time I try btrfs balance start it gives me an enospc error even though less than 20% of the available space is in use. I have tried the recommended method (from https://btrfs.wiki.kernel.org/index.php/Balance_Filters): with -dusage I can go up to -dusage=100 with no problems, but with -musage it works until 34 and then at musage=35 it fails with the enospc error.

I tested whether that free space is real by mounting the system without compression and filling the free space with a file written by dd from /dev/zero; the maximum file size is exactly the free space that is reported. I have tried deleting all my snapshots and deleting data until I was only using 6 GB out of 50 GB, and I get exactly the same errors. I have tried adding more files until I reached 11 GB to see if I get write errors when I add lots of small files; no problems there, but still the same balance error (like I said above, I have also filled all the free space with one single large file).

Here is more detailed information about my setup and output of several commands:

uname -a
Linux ArchLinux 4.1.4-1-ARCH #1 SMP PREEMPT Mon Aug 3 21:30:37 UTC 2015 x86_64 GNU/Linux

btrfs --version
btrfs-progs v4.1.2

btrfs fi show
Label: 'ArchLinux'  uuid: 6816726f-71ed-4b64-9071-60684a445e71
        Total devices 1 FS bytes used 9.86GiB
        devid    1 size 50.00GiB used 12.31GiB path /dev/sda2

btrfs fi df /
Data, single: total=10.00GiB, used=9.52GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.12GiB, used=354.31MiB
GlobalReserve, single: total=128.00MiB, used=0.00B

lsblk -o NAME,SIZE,FSTYPE,UUID,PARTLABEL
NAME    SIZE   FSTYPE  UUID                                  PARTLABEL
sda      50G
sda1      2M                                                 BIOS boot partition
sda2     50G   btrfs   6816726f-71ed-4b64-9071-60684a445e71  Linux x86-64 root (/)
sr0    1024M

btrfs filesystem usage /
Overall:
    Device size:          50.00GiB
    Device allocated:     12.31GiB
    Device unallocated:   37.68GiB
    Device missing:          0.00B
    Used:                 10.21GiB
    Free (estimated):     38.17GiB  (min: 19.33GiB)
    Data ratio:               1.00
    Metadata ratio:           2.00
    Global reserve:      128.00MiB  (used: 0.00B)

Data,single: Size:10.00GiB, Used:9.52GiB
   /dev/sda2  10.00GiB

Metadata,DUP: Size:1.12GiB, Used:354.16MiB
   /dev/sda2   2.25GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda2  64.00MiB

Unallocated:
   /dev/sda2  37.68GiB

- This is the maximum size I have; I have also tested with only 6 GiB used.
- I suspected that there is a problem with the metadata, so I added metadata_ratio=20 to all subvolumes in fstab and rebooted; nothing changed.

sudo btrfs scrub start -B /
scrub done for 6816726f-71ed-4b64-9071-60684a445e71
        scrub started at Tue Aug 11 11:07:36 2015 and finished after 00:01:42
        total bytes scrubbed: 10.21GiB with 0 errors

btrfs check output when run from the rescue cd with the partition unmounted:
Checking filesystem on /dev/sda2
UUID: 6816726f-71ed-4b64-9071-60684a445e71
found 10541920267 bytes used err is 0
total csum bytes: 9906264
total tree bytes: 370245632
total fs tree bytes: 337903616
total extent tree bytes: 20758528
btree space waste bytes: 63326339
file data blocks allocated: 10473455616
 referenced 14596616192
btrfs-progs v4.1.2

btrfs balance start output with the following options:

-dusage 100:
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=100
Done, had to relocate 2 out of 13 chunks

-dusage 100, second, third, ... run:
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=100
Done, had to relocate 1 out of 13 chunks

-musage 33, first run:
Dumping filters: flags 0x6, state 0x0, force is off
  METADATA (flags 0x2): balancing, usage=33
  SYSTEM (flags 0x2): balancing, usage=33
Done, had to relocate 2 out of 13 chunks

-musage 33, second, third, ... run:
Dumping filters: flags 0x6, state 0x0, force is off
  METADATA (flags 0x2): balancing, usage=33
  SYSTEM (flags 0x2): balancing, usage=33
Done, had to relocate 1 out of 12 chunks

-musage 35 always gives an error:
Dumping filters: flags 0x6, state 0x0, force is off
  METADATA (flags 0x2): balancing, usage=35
  SYSTEM (flags 0x2): balancing, usage=35
ERROR: error during balancing '/' - No space left on device
There may be more info in syslog - try dmesg | tail

output of dmesg | tail (after repeated trying):
[ 2481.262199] BTRFS info (device sda2): found 1 extents
[ 2487.331921] BTRFS info (device sda2): relocating block group 683432476672 flags 34
[ 2498.583018] BTRFS info (device sda2): relocating block group 683466031104 flags 34
[ 2503.843304] BTRFS info (device sda2): relocating block group 683499585536 flags 34
[ 2511.407124] BTRFS info (device sda2): relocating block group 683533139968 flags 34
[
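One way to pin down exactly where the metadata balance starts failing is to step the usage filter one point at a time and stop at the first error; a rough, untested loop (run as root on the mounted filesystem, same commands as above) would be:

# step -musage from 30 to 40 and stop at the first ENOSPC
for u in $(seq 30 40); do
    echo "=== musage=$u ==="
    btrfs balance start -musage=$u / || break
done
dmesg | tail -n 20    # the block group that failed to relocate is usually named here

If the goal is simply to get the balance to complete, the common workaround of temporarily adding a spare device (even a loop device backed by a file on another filesystem), running the balance, and then removing the device again may also be worth a try.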
Re: Usage of new added disk not updated while doing a balance
On 2015-08-11 07:08, Juan Orti Alcaine wrote:

Hello, I have added a new disk to my filesystem and I'm doing a balance right now, but I'm a bit worried that the disk usage does not get updated as it should. I remember from earlier versions that you could see the disk usage being balanced across all disks. These are the commands I've run:

# btrfs device add /dev/sdb2 /mnt/btrfs_raid1
# btrfs fi balance /mnt/btrfs_raid1

I see the unallocated space of sdc2 and sdd2 increasing, but for sdb2 (the new disk), it doesn't change. sdb2 doesn't even appear in the btrfs usage command for data, metadata and system. Is this normal? It's very strange the disk not showing up in the usage report.

How much slack space was allocated by BTRFS before running the balance (ie, how big a difference was there between the allocated and used space), and did the balance run to completion? If you had a lot of mostly empty chunks and stopped the balance part way through, then this is what I would expect to happen (balance back-fills partial chunks before it starts allocating new ones). If that is not the case however, then this is very much _not_ normal, and is almost certainly a bug, in which case you should make sure any important data on the filesystem is backed up before doing anything further with it (including unmounting it or rebooting the system).
Usage of new added disk not updated while doing a balance
Hello, I have added a new disk to my filesystem and I'm doing a balance right now, but I'm a bit worried that the disk usage does not get updated as it should. I remember from earlier versions that you could see the disk usage being balanced across all disks. These are the commands I've run: # btrfs device add /dev/sdb2 /mnt/btrfs_raid1 # btrfs fi balance /mnt/btrfs_raid1 I see the unallocated space of sdc2 and sdd2 increasing, but for sdb2 (the new disk), it doesn't change. sdb2 doesn't even appear in the btrfs usage command for data, metadata and system. Is this normal? It's very strange the disk not showing up in the usage report. # btrfs --version btrfs-progs v4.1 # uname -a Linux xenon 4.1.3-201.fc22.x86_64 #1 SMP Wed Jul 29 19:50:22 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux # btrfs fi usage /mnt/btrfs_raid1 Overall: Device size: 5.44TiB Device allocated: 2.74TiB Device unallocated:2.70TiB Device missing: 0.00B Used: 2.61TiB Free (estimated): 1.42TiB (min: 1.42TiB) Data ratio: 2.00 Metadata ratio: 2.00 Global reserve: 512.00MiB (used: 16.00KiB) Data,RAID1: Size:1.36TiB, Used:1.30TiB /dev/sdc2 1.36TiB /dev/sdd2 1.36TiB Metadata,RAID1: Size:10.00GiB, Used:8.11GiB /dev/sdc2 10.00GiB /dev/sdd2 10.00GiB System,RAID1: Size:32.00MiB, Used:224.00KiB /dev/sdc2 32.00MiB /dev/sdd2 32.00MiB Unallocated: /dev/sdb2 1.81TiB /dev/sdc2 454.48GiB /dev/sdd2 454.48GiB # btrfs fi df /mnt/btrfs_raid1 Data, RAID1: total=1.36TiB, used=1.30TiB System, RAID1: total=32.00MiB, used=224.00KiB Metadata, RAID1: total=10.00GiB, used=8.13GiB GlobalReserve, single: total=512.00MiB, used=18.83MiB # btrfs fi show /mnt/btrfs_raid1 Label: 'btrfs_raid1' uuid: 03eeb44b-de69-4f1f-9261-70bd7a5c6de0 Total devices 3 FS bytes used 1.30TiB devid1 size 1.81TiB used 1.37TiB path /dev/sdc2 devid2 size 1.81TiB used 1.37TiB path /dev/sdd2 devid3 size 1.81TiB used 0.00B path /dev/sdb2 btrfs-progs v4.1 And the kernel log: ago 11 11:54:45 xenon kernel: BTRFS info (device sdd2): disk added /dev/sdb2 ago 11 11:56:18 xenon kernel: BTRFS info (device sdd2): relocating block group 1715902349312 flags 17 ago 11 11:56:36 xenon kernel: BTRFS info (device sdd2): found 12127 extents ago 11 12:09:52 xenon kernel: BTRFS info (device sdd2): found 12127 extents ago 11 12:09:56 xenon kernel: BTRFS info (device sdd2): relocating block group 1714828607488 flags 17 ago 11 12:10:11 xenon kernel: BTRFS info (device sdd2): found 1076 extents ago 11 12:11:24 xenon kernel: BTRFS info (device sdd2): found 1076 extents ago 11 12:11:25 xenon kernel: BTRFS info (device sdd2): relocating block group 1713754865664 flags 17 ago 11 12:11:37 xenon kernel: BTRFS info (device sdd2): found 8 extents ago 11 12:11:50 xenon kernel: BTRFS info (device sdd2): found 8 extents ago 11 12:11:50 xenon kernel: BTRFS info (device sdd2): relocating block group 1712681123840 flags 17 ago 11 12:12:16 xenon kernel: BTRFS info (device sdd2): found 1432 extents ago 11 12:13:17 xenon kernel: BTRFS info (device sdd2): found 1432 extents [...] -- Juan Orti https://miceliux.com -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
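For reference, the progress of a balance after adding a device can be watched from another shell while it runs, rather than relying only on the usage breakdown; something along these lines (same commands and mount point as in this thread):

btrfs balance status /mnt/btrfs_raid1    # chunks relocated so far out of those considered
btrfs fi show /mnt/btrfs_raid1           # per-device 'used' should start growing on the new disk
btrfs fi usage /mnt/btrfs_raid1          # the new device appears under Data/Metadata once chunks land on it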
Re: Usage of new added disk not updated while doing a balance
2015-08-11 15:20 GMT+02:00 Austin S Hemmelgarn ahferro...@gmail.com: How much slack space was allocated by BTRFS before running the balance (ie, how big a difference was there between the allocated and used space), and did the balance run to completion? If you had a lot of mostly empty chunks and stopped the balance part way through, then this is what I would expect to happen (balance back-fills partial chunks before it starts allocating new ones). If that is not the case however, then this is very much _not_ normal, and is almost certainly a bug, in which case you should make sure any important data on the filesystem is backed up before doing anything further with it (including unmounting it or rebooting the system). I don't have the usage numbers before running the balance, but I have around 1000 readonly snapshots, so maybe that's a factor. It keeps running and using both CPU and IO, so I'll wait to see what happens. Thank you. -- Juan Orti https://miceliux.com -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
On 08/11/2015 01:07 AM, Marc MERLIN wrote:
On Sun, Aug 02, 2015 at 08:51:30PM -0700, Marc MERLIN wrote:
On Fri, Jul 24, 2015 at 09:24:46AM -0700, Marc MERLIN wrote:

Screenshot: http://marc.merlins.org/tmp/btrfs_crash.jpg

So it's a 32bit system, 3.19.8, crashing during snapshot deletion and backref walking. EIP is in do_walk_down+0x142. I've tried to match it to the sources on a local 32bit build, but it does not point to the expected crash site:

Thanks for looking. Unfortunately it's a mythtv box where, if I put a 64bit kernel on it, other things go wrong with the 32bit userland/64bit kernel split. But I'll put a newer 64bit kernel on it to see what happens and report back.

I got home, built the latest kernel and got netconsole working. 4.1.3/64bit and 32bit crash the same way. So, it's been several weeks that I can't use this filesystem. Is anyone interested in fixing the kernel bug before I wipe it? (As in, even if the FS is corrupted, it should not crash the kernel.)

From a48cf7a9ae44a17d927df5542c8b0be287aee9ed Mon Sep 17 00:00:00 2001
From: Josef Bacik jba...@fb.com
Date: Tue, 11 Aug 2015 11:39:37 -0400
Subject: [PATCH] Btrfs: kill BUG_ON() in btrfs_lookup_extent_info()

Replace it with an ASSERT(0) for the developers and an error for not the developers.

Signed-off-by: Josef Bacik jba...@fb.com
---
 fs/btrfs/extent-tree.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 5411f0a..f7fb120 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -818,7 +818,11 @@ search_again:
 			BUG();
 #endif
 		}
-		BUG_ON(num_refs == 0);
+		if (num_refs == 0) {
+			ASSERT(0);
+			ret = -EIO;
+			goto out_free;
+		}
 	} else {
 		num_refs = 0;
 		extent_flags = 0;
@@ -859,7 +863,6 @@ search_again:
 	}
 	spin_unlock(&delayed_refs->lock);
 out:
-	WARN_ON(num_refs == 0);
 	if (refs)
 		*refs = num_refs;
 	if (flags)
-- 
2.1.0
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 12:21 AM, Chris Murphy li...@colorremedies.com wrote: On Mon, Aug 10, 2015 at 7:23 PM, Timothy Normand Miller theo...@gmail.com wrote: On Mon, Aug 10, 2015 at 6:52 PM, Chris Murphy li...@colorremedies.com wrote: - complete dmesg for the failed mount It really doesn't say much. I have things like this: [8.643535] BTRFS info (device sdc): disk space caching is enabled [8.643789] BTRFS: failed to read the system array on sdc [8.706062] BTRFS: open_ctree failed [8.707124] BTRFS info (device sdc): disk space caching is enabled [8.710924] BTRFS: failed to read the system array on sdc [8.766080] BTRFS: open_ctree failed [8.766903] BTRFS info (device sdc): setting nodatacow, compression disabled [8.766905] BTRFS info (device sdc): disk space caching is enabled [8.767152] BTRFS: failed to read the system array on sdc [8.936019] BTRFS: open_ctree failed [8.936906] BTRFS info (device sdc): disk space caching is enabled [8.939922] BTRFS: failed to read the system array on sdc [8.995984] BTRFS: open_ctree failed [8.996796] BTRFS info (device sdc): disk space caching is enabled [8.997093] BTRFS: failed to read the system array on sdc [9.125936] BTRFS: open_ctree failed It looks like there's not enough redundancy remaining to mount and in such a case there's really not much to be done. I don't see nodatacow in your fstab, so I don't know why that's happening. That means no checksumming for data. Sorry. I was dumb. I only showed you the entry for what I was trying to mount manually. I have subvolumes, and this is what is in my fstab: UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /home btrfs compress=lzo,noatime,space_cache,subvol=home 0 2 UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /mnt/btrfs btrfs compress=lzo,noatime,space_cache 0 2 UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /mnt/vms btrfs noatime,nodatacow,space_cache,subvol=vms 0 2 UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /mnt/oldfiles btrfs compress=lzo,noatime,space_cache,subvol=oldfiles 0 2 UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /mnt/backup btrfs compress=lzo,noatime,space_cache,subvol=backup 0 2 Also, when I manually try to mount, I get things like this: # mount /mnt/btrfs mount: wrong fs type, bad option, bad superblock on /dev/sdc, missing codepage or helper program, or other error Have you tried to mount with -o degraded? Ooh! I can do that! Mounting ro,degraded, I see this: [94197.902443] BTRFS info (device sdc): allowing degraded mounts [94197.902448] BTRFS info (device sdc): disk space caching is enabled [94198.240621] BTRFS: bdev (null) errs: wr 1724, rd 305, flush 45, corrupt 0, gen 2 Mounting rw,degraded, I see this: [94312.091613] BTRFS info (device sdc): allowing degraded mounts [94312.091618] BTRFS info (device sdc): disk space caching is enabled [94312.194513] BTRFS: bdev (null) errs: wr 1724, rd 305, flush 45, corrupt 0, gen 2 [94319.824563] BTRFS: checking UUID tree Well, if I get something lengthy, I'll attach it to my bug report. Did the information I reported help at all? The entire dmesg is still useful because it should show libata errors if these aren't fully failed drives. So you should file a bug and include, literally, the entire unedited dmesg. Alright, I'll do that. Thanks! 
-- Chris Murphy

-- Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 1:56 PM, Timothy Normand Miller theo...@gmail.com wrote: On Tue, Aug 11, 2015 at 12:21 AM, Chris Murphy li...@colorremedies.com wrote: The entire dmesg is still useful because it should show libata errors if these aren't fully failed drives. So you should file a bug and include, literally, the entire unedited dmesg. Alright, I'll do that. Thanks! Here you go: https://bugzilla.kernel.org/show_bug.cgi?id=102691 -- Timothy Normand Miller, PhD Assistant Professor of Computer Science, Binghamton University http://www.cs.binghamton.edu/~millerti/ Open Graphics Project -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Scaling to 100k+ snapshots/subvolumes
Hi, In an early thread Duncan mentioned that btrfs does not scale well in the number of subvolumes (including snapshots). He recommended keeping the total number under 1000. I just wanted to understand this limitation further. Is this something that has been resolved or will be resolved in the future or is it something inherent to the design of btrfs? We have an application that could easily generate 100k-1M snapshots and 10s of thousands of subvolumes. We use snapshots to track very fine-grained filesystem histories and subvolumes to enforce quotas across a large number of distinct projects. Thanks, Tristan Duncan http://permalink.gmane.org/gmane.comp.file-systems.btrfs/43910 The question of number of subvolumes normally occurs in the context of snapshots, since snapshots are a special kind of subvolume. Ideally, you'll want to keep the total number of subvolumes (including snapshots) to under 1000, with the number of snapshots of any single subvolume limited to 250-ish (say under 300). However, even just four subvolumes being snapshotted to this level will reach the thousand, and 2000-3000 total isn't /too/ bad as long as it's no more than 250-300 snapshots per subvolume. But DEFINITELY try to keep it under 3000, and preferably under 2000, as the scaling really does start to go badly as the number of subvolumes increases beyond that. If you're dealing with 10k subvolumes/ snapshots, that's too many and you ARE likely to find yourself with problems. (With something like snapper, configuring it for say half-hour or hourly snapshots at the shortest time, with twice-daily or daily being more reasonable in many circumstances, and then thinning it down to say daily after a few days and weekly after four weeks, goes quite a long way toward reducing the number of snapshots per subvolume. Keeping it near 250-ish per subvolume is WELL within reason, and considering that a month or a year out, you're not likely to /care/ whether it's this hour or that, just pick a day or a week and if it's not what you want, go back or forward a day or a week, is actually likely to be more practical than having hundreds of half-hourly snapshots a year old to choose from. And 250-ish snapshots per subvolume really does turn out to be VERY reasonable, provided you're doing reasonable thinning.) -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
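For what it's worth, the thinning Duncan describes is easy to approximate with a few lines of shell once snapshots are named by timestamp. A sketch only, assuming read-only snapshots live under /mnt/snapshots and are named like home-YYYYMMDD-HHMM; the path and naming scheme are invented for illustration:

# keep the newest 250 snapshots of the 'home' subvolume, delete the rest
# (head -n -N prints all but the last N lines; GNU coreutils)
ls -d /mnt/snapshots/home-* | sort | head -n -250 | while read -r snap; do
    btrfs subvolume delete "$snap"
done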
Re: Missing dedupe/locking patch in integration-4.2 tree?
On Fri, Aug 07, 2015 at 10:11:46AM +0200, Holger Hoffstätte wrote: Mark's patch titled [PATCH 3/5] btrfs: fix clone / extent-same deadlocks [1] from his btrfs: dedupe fixes, features series is missing from the integration-4.2 tree and 4.2-rc5, where it still applies cleanly (as of 5 mins ago). Any particular reason why this was silently dropped? Same question here, I noticed this shortly after everything went upstream and was planning a resend but it would be great if you could tell us whether something was wrong with the patch or if it just got lost in the shuffle (no big deal). --Mark -- Mark Fasheh -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Missing dedupe/locking patch in integration-4.2 tree?
On 08/11/15 20:58, Mark Fasheh wrote: On Fri, Aug 07, 2015 at 10:11:46AM +0200, Holger Hoffstätte wrote: Mark's patch titled [PATCH 3/5] btrfs: fix clone / extent-same deadlocks [1] from his btrfs: dedupe fixes, features series is missing from the integration-4.2 tree and 4.2-rc5, where it still applies cleanly (as of 5 mins ago). Any particular reason why this was silently dropped? Same question here, I noticed this shortly after everything went upstream and was planning a resend but it would be great if you could tell us whether something was wrong with the patch or if it just got lost in the shuffle (no big deal). --Mark I saw this morning that it went into integration-4.3: https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=integration-4.3id=293a8489f300536dc6d996c35a6ebb89aa03bab2 So probably just an oversight. -h -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 3:57 PM, Chris Murphy li...@colorremedies.com wrote: On Tue, Aug 11, 2015 at 12:04 PM, Timothy Normand Miller theo...@gmail.com wrote: https://bugzilla.kernel.org/show_bug.cgi?id=102691 [7.729124] BTRFS: device fsid ecdff84d-b4a2-4286-a1c1-cd7e5396901c devid 2 transid 226237 /dev/sdd [7.746115] BTRFS: device fsid ecdff84d-b4a2-4286-a1c1-cd7e5396901c devid 4 transid 226237 /dev/sdb [7.826493] BTRFS: device fsid ecdff84d-b4a2-4286-a1c1-cd7e5396901c devid 3 transid 226237 /dev/sdc What do you get for 'btrfs fi show' # btrfs fi show Label: none uuid: 49ac9ad2-b529-4e6e-aef9-1c5b9e8a72f8 Total devices 1 FS bytes used 28.33GiB devid1 size 79.69GiB used 41.03GiB path /dev/sda3 Label: none uuid: ecdff84d-b4a2-4286-a1c1-cd7e5396901c Total devices 4 FS bytes used 1.46TiB devid2 size 931.51GiB used 767.00GiB path /dev/sdd devid3 size 931.51GiB used 760.03GiB path /dev/sdc devid4 size 931.51GiB used 767.00GiB path /dev/sdb *** Some devices missing Label: none uuid: f9331766-e50a-43d5-98dc-fabf5c68321d Total devices 1 FS bytes used 2.99TiB devid1 size 3.64TiB used 3.01TiB path /dev/sde1 btrfs-progs v4.1.2 I see devid 2, 3, 4 only for this volume UUID. So you definitely appear to have a failed device and that's why it doesn't mount automatically at boot time. You just need to use -o degraded, and that should work assuming no problems with the other three devices. If it does work, 'btrfs replace start...' is the ideal way to replace the failed drive. It's missing because I physically disconnected it. Someone on IRC suggested I try this in case the drive with the bad sector was interfering. Of course, now that I've done this and mounted read/write, we can't reintegrate the failing drive. If I lose the array, I won't cry. The backup appears to be complete. But it would be convenient to avoid having to restore from scratch, and I'm hoping this might help you guys too in some way. I really like btrfs, and I would like provide you with whatever info might contribute something. Maybe someone else can say whether nodatacow as a subvolume mount option will apply this to the entire volume. At the moment, I'm only trying to mount the whole volume, just so I could recover and scrub it, although as I mentioned in my earlier email, the scrub aborts with no report of why and with 0 errors. -- Timothy Normand Miller, PhD Assistant Professor of Computer Science, Binghamton University http://www.cs.binghamton.edu/~millerti/ Open Graphics Project -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 12:04 PM, Timothy Normand Miller theo...@gmail.com wrote: https://bugzilla.kernel.org/show_bug.cgi?id=102691 [7.729124] BTRFS: device fsid ecdff84d-b4a2-4286-a1c1-cd7e5396901c devid 2 transid 226237 /dev/sdd [7.746115] BTRFS: device fsid ecdff84d-b4a2-4286-a1c1-cd7e5396901c devid 4 transid 226237 /dev/sdb [7.826493] BTRFS: device fsid ecdff84d-b4a2-4286-a1c1-cd7e5396901c devid 3 transid 226237 /dev/sdc What do you get for 'btrfs fi show' I see devid 2, 3, 4 only for this volume UUID. So you definitely appear to have a failed device and that's why it doesn't mount automatically at boot time. You just need to use -o degraded, and that should work assuming no problems with the other three devices. If it does work, 'btrfs replace start...' is the ideal way to replace the failed drive. Maybe someone else can say whether nodatacow as a subvolume mount option will apply this to the entire volume. -- Chris Murphy -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
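Spelled out, the path Chris suggests would look roughly like the following (untested; /dev/sdX stands in for the new blank disk, and devid 1 is the missing member, since the 'btrfs fi show' output above lists devids 2, 3 and 4 as present):

# mount the three surviving members writable
mount -o degraded /dev/sdc /mnt/btrfs

# replace the missing member (devid 1) with the new disk; -B stays in the foreground
btrfs replace start -B 1 /dev/sdX /mnt/btrfs
btrfs replace status /mnt/btrfs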
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 11:56 AM, Timothy Normand Miller theo...@gmail.com wrote: On Tue, Aug 11, 2015 at 12:21 AM, Chris Murphy li...@colorremedies.com wrote: I don't see nodatacow in your fstab, so I don't know why that's happening. That means no checksumming for data. Sorry. I was dumb. I only showed you the entry for what I was trying to mount manually. I have subvolumes, and this is what is in my fstab: UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /home btrfs compress=lzo,noatime,space_cache,subvol=home 0 2 UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /mnt/btrfs btrfs compress=lzo,noatime,space_cache 0 2 UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /mnt/vms btrfs noatime,nodatacow,space_cache,subvol=vms 0 2 UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /mnt/oldfiles btrfs compress=lzo,noatime,space_cache,subvol=oldfiles 0 2 UUID=ecdff84d-b4a2-4286-a1c1-cd7e5396901c /mnt/backup btrfs compress=lzo,noatime,space_cache,subvol=backup 0 2 Huh. I thought nodatacow applies to an entire volume only, not per subvolume unless you use chattr +C (in which case it can be per subvolume, directory or per file). I could be confused, but I think you have mutually exclusive mount options. Have you tried to mount with -o degraded? Ooh! I can do that! Mounting ro,degraded, I see this: [94197.902443] BTRFS info (device sdc): allowing degraded mounts [94197.902448] BTRFS info (device sdc): disk space caching is enabled [94198.240621] BTRFS: bdev (null) errs: wr 1724, rd 305, flush 45, corrupt 0, gen 2 Mounting rw,degraded, I see this: [94312.091613] BTRFS info (device sdc): allowing degraded mounts [94312.091618] BTRFS info (device sdc): disk space caching is enabled [94312.194513] BTRFS: bdev (null) errs: wr 1724, rd 305, flush 45, corrupt 0, gen 2 [94319.824563] BTRFS: checking UUID tree I don't see any mount failure message. It worked then? -- Chris Murphy -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
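A quick way to see which options actually won for a given boot, independent of what fstab asked for, is to look at the effective options on each subvolume mount (generic commands, nothing assumed beyond what this thread already shows):

findmnt -t btrfs -o TARGET,SOURCE,OPTIONS
grep btrfs /proc/self/mounts    # same information, one line per mount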
Re: Scaling to 100k+ snapshots/subvolumes
If someone can answer Tristan's question, can they also add in if large volumes of frequently created and destroyed snapshots/subvolumes will cause issues? Or, if they're deleted quickly after being made, is it just the number that exists at any given time that matters? (Building source in chroot subvolumes with its own o/s install, as in Arch's devtools scripts, if used to create *separate* subvolumes under each source build, rather than a global shared one.) On Tue, Aug 11, 2015 at 6:33 PM, Tristan Zajonc tris...@sense.io wrote: Hi, In an early thread Duncan mentioned that btrfs does not scale well in the number of subvolumes (including snapshots). He recommended keeping the total number under 1000. I just wanted to understand this limitation further. Is this something that has been resolved or will be resolved in the future or is it something inherent to the design of btrfs? We have an application that could easily generate 100k-1M snapshots and 10s of thousands of subvolumes. We use snapshots to track very fine-grained filesystem histories and subvolumes to enforce quotas across a large number of distinct projects. Thanks, Tristan Duncan http://permalink.gmane.org/gmane.comp.file-systems.btrfs/43910 The question of number of subvolumes normally occurs in the context of snapshots, since snapshots are a special kind of subvolume. Ideally, you'll want to keep the total number of subvolumes (including snapshots) to under 1000, with the number of snapshots of any single subvolume limited to 250-ish (say under 300). However, even just four subvolumes being snapshotted to this level will reach the thousand, and 2000-3000 total isn't /too/ bad as long as it's no more than 250-300 snapshots per subvolume. But DEFINITELY try to keep it under 3000, and preferably under 2000, as the scaling really does start to go badly as the number of subvolumes increases beyond that. If you're dealing with 10k subvolumes/ snapshots, that's too many and you ARE likely to find yourself with problems. (With something like snapper, configuring it for say half-hour or hourly snapshots at the shortest time, with twice-daily or daily being more reasonable in many circumstances, and then thinning it down to say daily after a few days and weekly after four weeks, goes quite a long way toward reducing the number of snapshots per subvolume. Keeping it near 250-ish per subvolume is WELL within reason, and considering that a month or a year out, you're not likely to /care/ whether it's this hour or that, just pick a day or a week and if it's not what you want, go back or forward a day or a week, is actually likely to be more practical than having hundreds of half-hourly snapshots a year old to choose from. And 250-ish snapshots per subvolume really does turn out to be VERY reasonable, provided you're doing reasonable thinning.) -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Damaged filesystem, can read, can't repair, error says to contact devs
On Tue, Aug 11, 2015 at 2:26 PM, Timothy Normand Miller theo...@gmail.com wrote: On Tue, Aug 11, 2015 at 3:47 PM, Chris Murphy li...@colorremedies.com wrote: Huh. I thought nodatacow applies to an entire volume only, not per subvolume unless you use chattr +C (in which case it can be per subvolume, directory or per file). I could be confused, but I think you have mutually exclusive mount options. Well, at the time I set up this system, I asked on IRC, and people said it should work. I've never seen any errors from this. Error implies a mistake with some sort of reference. All Btrfs does is inform you of conflicting options: [8.766903] BTRFS info (device sdc): setting nodatacow, compression disabled Unfortunately that message should not reference just one device but a volume label or UUID in my opinion. When I test manual mount of subvolume first with -o compress, followed by mount of another subvolume with -o nodatacow, this is the results from mount: /dev/sdb on /var/mnt/root type btrfs (rw,relatime,seclabel,compress=zlib,space_cache) /dev/sdb on /var/mnt/home type btrfs (rw,relatime,seclabel,compress=zlib,space_cache) When I do -o nodatacow first, followed by -o compress /dev/sdb on /var/mnt/root type btrfs (rw,relatime,seclabel,nodatasum,nodatacow,space_cache) /dev/sdb on /var/mnt/home type btrfs (rw,relatime,seclabel,nodatasum,nodatacow,space_cache) The compress is ignored, and it looks like nodatasum and nodatacow apply to everything. The nodatasum means no raid1 self-healing is possible for any data on the entire volume. Metadata checksumming is still enabled. [94312.091613] BTRFS info (device sdc): allowing degraded mounts [94312.091618] BTRFS info (device sdc): disk space caching is enabled [94312.194513] BTRFS: bdev (null) errs: wr 1724, rd 305, flush 45, corrupt 0, gen 2 [94319.824563] BTRFS: checking UUID tree I don't see any mount failure message. It worked then? Yes and no. It's mounted, but a scrub aborts silently: # btrfs scrub status /mnt/btrfs/ scrub status for ecdff84d-b4a2-4286-a1c1-cd7e5396901c scrub started at Tue Aug 11 13:56:36 2015 and was aborted after 01:31:55 total bytes scrubbed: 2.19TiB with 0 errors No new messages appeared in dmesg, so I can't tell why it aborted. It's also odd that it reports zero errors, given that it aborted. Well I wouldn't expect a scrub to completely work in this, even though it probably should fail more gracefully than this: a.) you don't have a complete array I don't know what a scrub of a degraded volume even means b.) the data has no checksums so the only thing that can really be scrubbed is metadata So I'd say there are three UI/UX bugs here: a.) The info message about nodatacow overriding compression should refer to label and/or UUID not device. b.) When degraded, a scrub should give more meaningful information on the scope of what can and can't be done; or at the least say scrub isn't possible on degraded volumes. c.) When nodatacow, scrub should scrub metadata and inform user metadata is scrubbed but data can't be scrubbed due to nodatacow. -- Chris Murphy -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Damaged filesystem, can read, can't repair, error says to contact devs
Timothy Normand Miller posted on Tue, 11 Aug 2015 17:32:12 -0400 as excerpted: On Tue, Aug 11, 2015 at 5:24 PM, Chris Murphy li...@colorremedies.com wrote: There is still data redundancy. Will a scrub at least notice that the copies differ? No, that's what I mean by nodatasum means no raid1 self-healing is possible. You have data redundancy, but without checksums btrfs has no way to know if they differ. It doesn't do two reads and compares them, it's just like md raid, it picks one device, and so long as there's no read error from the device, that copy of the data is assumed to be good. Ok, that makes sense. I'm guessing it wouldn't be worth it to add a feature like this because (a) few people use nodatacow or end up in my situation, and (b) if they did, and the two copies were inconsistent, what would you do? I suppose for me, it would be nice to know which files were affected. FWIW, nodatacow and nodatasum are intended to /eventually/ be per- subvolume mount options. The infrastructure is there to make it so. It's just that the code to actually handle those mount options separately per subvolume doesn't exist yet, so they apply globally. Similarly, the intention is to eventually allow per-subvolume and possibly even per-file raid-level specifications, while currently, the whole filesystem must be set to the same raid level (except that data and metadata raid levels are set separately). It is currently possible to have multiple raid levels, but only because a raid-level conversion was started (either due to a balance-convert, or due to adding a second device changing the metadata default to raid1 from dup, for instance) and never finished. So it's not so much a question of not worth it to add the no-checksum data redundancy scrub feature, it's that nodatacow and nodatasum are really intended to be exceptions where the admin has specifically disabled the checksumming, and are not intended to ever apply to a full filesystem, only, at most, to a particular subvolume. The fact that if the mount option is used today it applies to the full filesystem is simply a temporary situational accident of not having the per-subvolume mount-option code implemented yet. -- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs-progs: btrfs balance returns enospc error on a system with 80% free space
Catalin posted on Tue, 11 Aug 2015 12:18:28 +0300 as excerpted: I have a recently installed an Arch Linux x86_64 system on a 50GB btrfs partition and every time I try btrfs balance start it gives me an enospc error even though I have less than 20% of the available space full. I have tried the recommended method (from https://btrfs.wiki.kernel.org/index.php/Balance_Filters) and with -dusage I can go up to -dusage=100 with no problems but with -musage it works until 34 and then at musage=35 it fails with the enospc error. Here is more detailed information about my setup and output of several commands: uname -[r] 4.1.4-1-ARCH btrfs --version btrfs-progs v4.1.2 Thanks. That's about the first thing we ask for, and you're current on both kernel and userspace. =:^) btrfs fi show Label: 'ArchLinux' uuid: 6816726f-71ed-4b64-9071-60684a445e71 Total devices 1 FS bytes used 9.86GiB devid1 size 50.00GiB used 12.31GiB path /dev/sda2 btrfs fi df / Data, single: total=10.00GiB, used=9.52GiB System, DUP: total=32.00MiB, used=16.00KiB Metadata, DUP: total=1.12GiB, used=354.31MiB GlobalReserve, single: total=128.00MiB, used=0.00B Second thing we ask for. =:^) (FWIW, usage is a newer command that basically combines the info of both of these, printing it in an often more understandable format. But regulars are used to dealing with these older ones, so I omitted your usage output.) 50 GiB single-device filesystem, only 12.31 GiB allocated, default single data, dup metadata. All healthy here. =:^) [btrfs check and scrub returned no errors] btrfs balance start output with the following options: -dusage 100: Dumping filters: flags 0x1, state 0x0, force is off DATA (flags 0x2): balancing, usage=100 Done, had to relocate 2 out of 13 chunks -dusage 100, second, third, ... run: Dumping filters: flags 0x1, state 0x0, force is off DATA (flags 0x2): balancing, usage=100 Done, had to relocate 1 out of 13 chunks It's unlikely to help, but when you're doing 100% anyway, you can simply use -d, IOW, tell balance data-only, but no filters. Again, -d should work, but shouldn't help. Of course you can do the same with metadata, but that's unlikely to work, since we already know a metadata balance dies with a chunk that's between 33 and 35 percent full, and as soon as it hits it... -musage 33, first run: Dumping filters: flags 0x6, state 0x0, force is off METADATA (flags 0x2): balancing, usage=33 SYSTEM (flags 0x2): balancing, usage=33 Done, had to relocate 2 out of 13 chunks -musage 33, second, third, run: Dumping filters: flags 0x6, state 0x0, force is off METADATA (flags 0x2): balancing, usage=33 SYSTEM (flags 0x2): balancing, usage=33 Done, had to relocate 1 out of 12 chunks -musage 35 always gives an error: Dumping filters: flags 0x6, state 0x0, force is off METADATA (flags 0x2): balancing, usage=35 SYSTEM (flags 0x2): balancing, usage=35 ERROR: error during balancing '/' - No space left on device There may be more info in syslog - try dmesg | tail output of dmesg | tail (after repeated trying): [Nothing much, reallocating blocks, ENOSPC error.] cat /etc/fstab # /dev/sda2 LABEL=ArchLinux UUID=6816726f-71ed-4b64-9071-60684a445e71/btrfs rw,noatime,compress-force=lzo,space_cache,autodefrag 0 0 [subvolume mounts of the same btrfs omitted, subvolume/snapshot list omitted.] 
(like I said I have also tried with all the snapshots deleted) I have tried running the command both from inside the system and mounted from a rescue cd with different combinations of mount options like enabling and disabling space-cache / nospace_cache , clear_cache, enospc_debug, enable and disable compression or autodefrag. I have tried defragmenting everything, filling all the space, adding files, deleting files, making snapshots, deleting snapshots still the same problem. I have run the balance command on both the root subvolume and on subvolid=0. I have tried putting the balance commands with options that work inside a for to run 1000 times hoping that maybe that one relocated chunk it says about might actually solve something in time but it doesn't (I am new to btrfs and not 100% about how balance works). Everything else works fine, the system is very fast, good compression, no other errors and I have no other problems but the fact that I have this error means something is wrong and I don't know what is the problem and how to solve it. You really have both included all sorts of info, and tried all sorts of stuff. Top marks on that! But unfortunately it's not helping with the problem... One question. You said you _recently_ installed. Just how recently, or more directly, what version of btrfs-progs did you use for the mkfs.btrfs? Or was it perhaps a conversion from ext*? I ask, because... the mkfs.btrfs from btrfs-progs v4.1.1 had a critical bug, with v4.1.2 released along with a message
Re: Scaling to 100k+ snapshots/subvolumes
Tristan Zajonc posted on Tue, 11 Aug 2015 11:33:45 -0700 as excerpted: In an early thread Duncan mentioned that btrfs does not scale well in the number of subvolumes (including snapshots). He recommended keeping the total number under 1000. I just wanted to understand this limitation further. Is this something that has been resolved or will be resolved in the future or is it something inherent to the design of btrfs? It is not resolved yet, but it's definitely on the radar. I don't personally understand the details well enough to know if the problem is inherent to btrfs, or if some optimized rewrite down the road is likely to at least yield linear scaling. On the practical side, one related thing I do know is that this is the reason snapshot-aware-defrag was disabled a few kernel cycles after being introduced -- it simply didn't scale, and the thought was, better a defrag that at least worked for the snapshot you pointed it at, even at the cost of increasing usage due to COW if other snapshots pointed at the same file extents, than a defrag that basically didn't work at all. But the intent remains to at least get scaling working well enough to have snapshot-aware-defrag again. So when snapshot-aware-defrag is enabled again, that's your clue that things should be scaling at least /reasonably/ well, and it's time to reexamine the situation. Until then, I'd not recommend trying it. We have an application that could easily generate 100k-1M snapshots and 10s of thousands of subvolumes. We use snapshots to track very fine-grained filesystem histories and subvolumes to enforce quotas across a large number of distinct projects. Btrfs quotas... have been another sticky wicket on btrfs, both as earlier the code was simply broken (tho AFAIK that's fixed in general, now), and because due to the way it works, quota tracking multiplies the scaling issues several fold (certainly in the original code form). AFAIK they've actually done at least two partial rewrites, so are on the third quota code version now. The third-try quota code is fresh enough I don't think people know yet how well it's going to perform in deployment. As a result of that quota code history, my recommendation has been that unless you're deliberately testing it, if you don't need quotas, keep it turned off on btrfs and avoid the issues it has been known, at least historically, to trigger. As btrfs quota code is demonstrably not yet stable and reliable enough to use, if you *do* actually depend on quotas, you should definitely be on some other filesystem where the quota code is well tested and known to be dependable, as that simply doesn't describe btrfs quota code at this point. But there's actually some pretty big effort going into the quota code at the moment, this the fact that we're on the third version now, and they're definitely planning on it actually working, or they'd not be sinking the effort into it that they are. And as I said, the quota code was multiplying the scaling issues several fold, so getting quotas actually working well is a big part of getting the scaling issues fixed as well. But beyond that; in particular, whether it's ever likely to work at the scales you mention above, is something you'd have to ask the devs, as I'm just a list regular and btrfs-using admin, with a use-case that doesn't directly involve either quotas or subvolumes/snapshotting to any great degree. 
So while I can point to the current situation and the current trend and work areas, I have effectively no idea if scaling to the numbers you mention above is even technically possible, or not. -- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
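For completeness, the per-project quota mechanics Tristan is relying on look roughly like this (paths invented for illustration; this only shows the commands and says nothing about how well qgroups scale, which is the open question here):

btrfs quota enable /mnt/pool
btrfs subvolume create /mnt/pool/project-a
# cap the space referenced by this subvolume at 10GiB
btrfs qgroup limit 10G /mnt/pool/project-a
btrfs qgroup show /mnt/pool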
Re: Damaged filesystem, can read, can't repair, error says to contact devs
Russell Coker posted on Wed, 12 Aug 2015 13:04:27 +1000 as excerpted: Linux Software RAID scrub will copy the data from one disk to the other to make them identical, the theory is that it's best to at least be consistent if you can't be sure you are right. Will a BTRFS scrub do this on a non-CoW file? While I honestly don't know, a reasonably educated guess is no, btrfs scrub assumes COW and checksumming, and doesn't do anything if there's no checksums to verify against. (I know scrub skips verification of items without checksums in the normally checksummed case, so it's reasonable to assume it would skip all data, verifying only metadata, if no data has checksums.) In a case like this where nodatacow has disabled checksumming, the best thing one can do is manually check at least some samples, and if there's no visible corruption, assume the existing data is correct and (after a scrub to verify at least metadata) do a btrfs check --init-csum-tree, to initialize the checksums to at least cover the existing situation, whatever it may be. Scrub should be able to work after that, since it has csums to work with. -- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
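In command form, the sequence Duncan outlines would be something like the following (untested sketch; --init-csum-tree rewrites the checksum tree, so it must only be run against an unmounted filesystem and only with a verified backup on hand; any member device can be named, /dev/sdb here as in the 'fi show' output earlier in the thread, and the mount points are those from the fstab posted earlier):

# after spot-checking files and scrubbing (metadata) while mounted,
# unmount every mount point of this filesystem, subvolume mounts included
umount /home /mnt/btrfs /mnt/vms /mnt/oldfiles /mnt/backup

# rebuild the checksum tree, then re-verify
btrfs check --init-csum-tree /dev/sdb
btrfs check /dev/sdb

# remount and scrub; data now has csums for scrub to verify against
mount /dev/sdb /mnt/btrfs
btrfs scrub start -B /mnt/btrfs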