Re: raid6, disks of different sizes, ENOSPC errors despite having plenty of space
Thanks, this makes sense. I freed up some space and the re-balance back
to raid1 is now running (I had to run 'btrfs balance -dusage=5' before
some free space actually became available). Filed the other issue as
https://bugzilla.kernel.org/show_bug.cgi?id=74761 .

On Wed, Apr 23, 2014 at 6:53 PM, Hugo Mills h...@carfax.org.uk wrote:
> On Wed, Apr 23, 2014 at 05:04:10PM -0400, Sergey Ivanyuk wrote:
> > Hi,
> >
> > I have a filesystem that I've converted to raid6 from raid1, on 4
> > drives (I have another copy of the data):
> >
> > Total devices 4 FS bytes used 924.64GiB
> >     devid 1 size 1.82TiB used 474.00GiB path /dev/sdd
> >     devid 2 size 465.76GiB used 465.76GiB path /dev/sda
> >     devid 3 size 465.76GiB used 465.76GiB path /dev/sdb
> >     devid 4 size 465.76GiB used 465.73GiB path /dev/sdc
> >
> > Data, RAID6: total=924.00GiB, used=923.42GiB
> > System, RAID1: total=32.00MiB, used=208.00KiB
> > Metadata, RAID1: total=1.70GiB, used=1.28GiB
> > Metadata, DUP: total=384.00MiB, used=252.13MiB
> > unknown, single: total=512.00MiB, used=0.00
> >
> > Recent btrfs-progs built from source, kernel 3.15.0-rc2 on armv7l.
> > Despite having plenty of space left on the larger drive, attempting
> > to copy more data onto the filesystem results in a kworker process
> > pegged at 100% CPU for a very long time (10s of minutes), at which
> > point the writes proceed for some time, and the process repeats
> > until the eventual No space left on device error. Balancing fails
> > with the same error, even if attempting to convert back to raid1.
> >
> > I realize that this likely has something to do with the disparity
> > between device sizes, and per the wiki a fixed-width stripe may
> > help, though I'm not sure if it's possible to change the stripe
> > width in my situation, since I can't rebalance. Is there anything I
> > can do to get this filesystem back to writable state?
>
> With those device sizes, yes, you're going to have limits on the
> available data you can store -- with RAID-6, it'll be 465.76*(4-2) =
> 931.52 GB (less metadata space), so your conclusion above is indeed
> correct.
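The 465.76*(4-2) figure above can be reproduced with a short calculation. This is only a sketch under stated assumptions: it models every RAID-6 stripe as spanning all devices that still have free space (that is, no fixed-width stripes), and it assumes at least three devices are needed to place a stripe. The function name `raid6_usable_gib` and the `min_devices` default are made up for illustration and are not btrfs code.

```python
def raid6_usable_gib(sizes_gib, parity=2, min_devices=3):
    """Estimate usable data space for full-width striping with parity.

    Assumption: each allocation round stripes across every device that
    still has free space, and stops once fewer than `min_devices`
    devices have space left.
    """
    free = sorted(s for s in sizes_gib if s > 0)
    usable = 0.0
    while len(free) >= min_devices:
        step = free[0]                        # smallest remaining device limits this round
        usable += step * (len(free) - parity)  # data share of each full-width stripe
        free = [f - step for f in free if f - step > 1e-9]
    return usable


if __name__ == "__main__":
    # Device sizes from the report: 1.82 TiB (= 1863.68 GiB) and 3 x 465.76 GiB.
    sizes = [1863.68, 465.76, 465.76, 465.76]
    print(f"{raid6_usable_gib(sizes):.2f} GiB")  # prints "931.52 GiB"
```

Once the three small devices are full, only one device has free space, so no further stripe can be placed and the remaining space on the large drive stays unusable for RAID-6 data.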
> We don't have the fixed-width stripe feature implemented yet, which
> probably explains why you can't use it. :) You can play with an
> approximation of the consequences, once the feature is there, at
> http://carfax.org.uk/btrfs-usage/ . Without that feature, though,
> there's not much you can do to improve the situation.
>
> What might help in converting back to RAID-1 is adding a small device
> to the FS temporarily before doing the conversion, and then removing
> it again afterwards.
>
> > Also, here's a stack trace for the stuck kworker process, which
> > appears to be a bug since it does this for a very long time:
>
> This is probably something different.
>
> Hugo.
>
> --
> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
> PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
> --- Computer Science is not about computers, any more than ---
>     astronomy is about telescopes.

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
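The workaround suggested in the reply above (temporarily add a small device, convert back to RAID-1, then remove the device) might look roughly like the sequence below. This is a hedged sketch, not a tested procedure: the mount point `/mnt/data` and the spare device `/dev/sde` are hypothetical, the helper name is invented, and the code only builds and prints the command lines rather than running them. It converts data only, since the metadata in the report is already mostly RAID1.

```python
def raid1_conversion_plan(mountpoint, temp_device):
    """Return the btrfs command lines for the temporary-device workaround.

    Nothing is executed here; the caller decides whether to run them.
    """
    return [
        # 1. Give the allocator some free space on an extra device.
        ["btrfs", "device", "add", temp_device, mountpoint],
        # 2. Convert data chunks back to RAID-1.
        ["btrfs", "balance", "start", "-dconvert=raid1", mountpoint],
        # 3. Remove the temporary device again once the balance finishes.
        ["btrfs", "device", "delete", temp_device, mountpoint],
    ]


if __name__ == "__main__":
    # Hypothetical mount point and spare device, for illustration only.
    for argv in raid1_conversion_plan("/mnt/data", "/dev/sde"):
        print(" ".join(argv))
```

Printing the plan first makes it easy to check the device name and mount point before anything touches the filesystem.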
raid6, disks of different sizes, ENOSPC errors despite having plenty of space
Hi,

I have a filesystem that I've converted to raid6 from raid1, on 4 drives
(I have another copy of the data):

Total devices 4 FS bytes used 924.64GiB
    devid 1 size 1.82TiB used 474.00GiB path /dev/sdd
    devid 2 size 465.76GiB used 465.76GiB path /dev/sda
    devid 3 size 465.76GiB used 465.76GiB path /dev/sdb
    devid 4 size 465.76GiB used 465.73GiB path /dev/sdc

Data, RAID6: total=924.00GiB, used=923.42GiB
System, RAID1: total=32.00MiB, used=208.00KiB
Metadata, RAID1: total=1.70GiB, used=1.28GiB
Metadata, DUP: total=384.00MiB, used=252.13MiB
unknown, single: total=512.00MiB, used=0.00

Recent btrfs-progs built from source, kernel 3.15.0-rc2 on armv7l.
Despite having plenty of space left on the larger drive, attempting to
copy more data onto the filesystem results in a kworker process pegged
at 100% CPU for a very long time (10s of minutes), at which point the
writes proceed for some time, and the process repeats until the eventual
No space left on device error. Balancing fails with the same error, even
if attempting to convert back to raid1.

I realize that this likely has something to do with the disparity
between device sizes, and per the wiki a fixed-width stripe may help,
though I'm not sure if it's possible to change the stripe width in my
situation, since I can't rebalance. Is there anything I can do to get
this filesystem back to writable state?
Also, here's a stack trace for the stuck kworker process, which appears
to be a bug since it does this for a very long time:

Exception stack(0xab4699c8 to 0xab469a10)
99c0: aec7c870 aec7c841 0800 aec7c870
99e0: ab469ad0 bd51e880 3000 0006c000 0005 ab469a10
9a00: 80299c8c 80310098 200e0013
[80011e80] (__irq_svc) from [80310098] (rb_next+0x14/0x5c)
[80310098] (rb_next) from [80299c8c] (btrfs_find_space_for_alloc+0x138/0x344)
[80299c8c] (btrfs_find_space_for_alloc) from [80240020] (find_free_extent+0x378/0xabc)
[80240020] (find_free_extent) from [80240840] (btrfs_reserve_extent+0xdc/0x164)
[80240840] (btrfs_reserve_extent) from [8025aef4] (cow_file_range+0x17c/0x5bc)
[8025aef4] (cow_file_range) from [8025c1e0] (run_delalloc_range+0x34c/0x380)
[8025c1e0] (run_delalloc_range) from [80274d6c] (__extent_writepage+0x708/0x940)
[80274d6c] (__extent_writepage) from [802754b4] (extent_writepages+0x238/0x368)
[802754b4] (extent_writepages) from [8009b190] (do_writepages+0x24/0x38)
[8009b190] (do_writepages) from [800ef59c] (__writeback_single_inode+0x28/0x110)
[800ef59c] (__writeback_single_inode) from [800f04c8] (writeback_sb_inodes+0x184/0x38c)
[800f04c8] (writeback_sb_inodes) from [800f0740] (__writeback_inodes_wb+0x70/0xac)
[800f0740] (__writeback_inodes_wb) from [800f0978] (wb_writeback+0x1fc/0x20c)
[800f0978] (wb_writeback) from [800f0b78] (bdi_writeback_workfn+0x144/0x338)
[800f0b78] (bdi_writeback_workfn) from [80037cfc] (process_one_work+0x110/0x368)
[80037cfc] (process_one_work) from [800383c8] (worker_thread+0x138/0x3e8)
[800383c8] (worker_thread) from [8003de90] (kthread+0xcc/0xe8)
[8003de90] (kthread) from [8000e238] (ret_from_fork+0x14/0x3c)