big volumes only work reliably with ssd_spread
Hello,

for around two or three years I have been using btrfs for incremental VM backups.

Some data:
- volume size 60TB
- around 2000 subvolumes
- each differential backup stacks on top of a subvolume
- compress-force=zstd
- space_cache=v2
- no quota / qgroup

This has worked fine since kernel 4.14, except that I need ssd_spread as a mount option. If I do not use ssd_spread, I always end up with very slow performance and a single kworker process using 100% CPU after some days.

With ssd_spread those boxes have been running fine for around 6 months.

Is this something expected? I haven't found any hint regarding such an impact.

Thanks!

Greets,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
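To confirm that a box is really running with the options listed above, the mount table can be inspected. The following is only a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options); the function name and the sample lines are made up for illustration and are not taken from the affected systems.

```python
def btrfs_mounts_missing(option, mounts_text):
    """Return mount points of btrfs filesystems whose options lack `option`."""
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        # Options may carry values (e.g. space_cache=v2); compare the key only.
        keys = {opt.split("=", 1)[0] for opt in fields[3].split(",")}
        if option not in keys:
            missing.append(fields[1])
    return missing


# Hypothetical sample; on a real system pass open("/proc/mounts").read().
SAMPLE = (
    "/dev/sdc1 /vmbackup btrfs rw,noatime,compress-force=zstd,space_cache=v2 0 0\n"
    "/dev/sdg1 /vmbackup2 btrfs rw,noatime,compress-force=zstd,ssd_spread,space_cache=v2 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)

print(btrfs_mounts_missing("ssd_spread", SAMPLE))
```

The key comparison is exact, so a mount that only carries "ssd" is still reported as missing "ssd_spread".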
Re: how to repair or access broken btrfs?
On 14.11.2017 at 18:45, Andrei Borzenkov wrote:
> On 14.11.2017 at 12:56, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> after a controller firmware bug / failure i've a broken btrfs.
>>
>> # parent transid verify failed on 181846016 wanted 143404 found 143399
>>
>> running repair, fsck or zero-log always results in the same failure message:
>> extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered,
>> value -1
>> .. stack trace ..
>>
>> Is there an chance to get at least a single file out of the broken fs?
>>
>
> Did you try "btrfs restore"?

Great, that worked for that file. I'm still wondering why a repair is not possible.

Greets,
Stefan
how to repair or access broken btrfs?
Hello,

after a controller firmware bug / failure I have a broken btrfs.

# parent transid verify failed on 181846016 wanted 143404 found 143399

Running repair, fsck or zero-log always results in the same failure message:
extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered, value -1
.. stack trace ..

Is there any chance to get at least a single file out of the broken fs?

Greets,
Stefan

Complete output:
./btrfs check --repair /dev/mapper/crypt_md0
enabling repair mode
parent transid verify failed on 181846016 wanted 143404 found 143399
parent transid verify failed on 181846016 wanted 143404 found 143399
Ignoring transid failure
Checking filesystem on /dev/mapper/crypt_md0
UUID: d3f9eee9-efbd-4590-858f-27b39d453350
repair mode will force to clear out log tree, are you sure? [y/N]: y
parent transid verify failed on 308183040 wanted 143404 found 143399
parent transid verify failed on 308183040 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 338870272 wanted 143404 found 143399
parent transid verify failed on 338870272 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 12778157178880 wanted 143404 found 143399
parent transid verify failed on 12778157178880 wanted 143404 found 143399
Ignoring transid failure
leaf parent key incorrect 38699008
btrfs unable to find ref byte nr 12778147823616 parent 0 root 2 owner 0 offset 0
parent transid verify failed on 308183040 wanted 143404 found 143399
Ignoring transid failure
leaf parent key incorrect 91766784
extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered, value -1
./btrfs[0x415cb3]
./btrfs[0x416ee5]
./btrfs[0x417104]
./btrfs[0x418cea]
./btrfs[0x418f06]
./btrfs(btrfs_alloc_free_block+0x1e4)[0x41b8d0]
./btrfs(__btrfs_cow_block+0xd3)[0x40c5f9]
./btrfs(btrfs_cow_block+0x110)[0x40d03b]
./btrfs(commit_tree_roots+0x53)[0x439a37]
./btrfs(btrfs_commit_transaction+0xf9)[0x439e02]
./btrfs(cmd_check+0x861)[0x46172e]
./btrfs(main+0x163)[0x40b5e9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f44b14fab45]
./btrfs[0x40b0b9]
Aborted
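Each "parent transid verify failed" message means that the tree block at the given address carries an older generation (found) than its parent expects (wanted). A small sketch for summarising these messages follows; the function name is hypothetical, and the sample log is a shortened excerpt of the output above.

```python
import re

TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def transid_gaps(log_text):
    """Map each failing block address to (wanted, found, wanted - found)."""
    gaps = {}
    for addr, wanted, found in TRANSID_RE.findall(log_text):
        wanted, found = int(wanted), int(found)
        gaps[int(addr)] = (wanted, found, wanted - found)
    return gaps


LOG = """\
parent transid verify failed on 181846016 wanted 143404 found 143399
parent transid verify failed on 181846016 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 308183040 wanted 143404 found 143399
"""

for addr, (wanted, found, gap) in sorted(transid_gaps(LOG).items()):
    print(f"block {addr}: wanted {wanted}, found {found}, behind by {gap}")
```

In this log every affected block is five generations behind, which is consistent with a handful of recent transactions never having reached the disks. That reading is an interpretation, not something the tool reports.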
btrfs-progs: check --repair crashes with BUG ON
Hello,

after a power failure I have a btrfs volume which isn't mountable.

dmesg shows:
parent transid verify failed on 181846016 wanted 143404 found 143399

If I run:
btrfs check --repair /dev/mapper/crypt_md1

The output is:
parent transid verify failed on 181846016 wanted 143404 found 143399
parent transid verify failed on 181846016 wanted 143404 found 143399
Ignoring transid failure
Clearing log on /dev/mapper/crypt_md0, previous log_root 1520200695808, level 0
parent transid verify failed on 308183040 wanted 143404 found 143399
parent transid verify failed on 308183040 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 338870272 wanted 143404 found 143399
parent transid verify failed on 338870272 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 12778157178880 wanted 143404 found 143399
parent transid verify failed on 12778157178880 wanted 143404 found 143399
Ignoring transid failure
leaf parent key incorrect 38699008
btrfs unable to find ref byte nr 12778147823616 parent 0 root 2 owner 0 offset 0
parent transid verify failed on 308183040 wanted 143404 found 143399
Ignoring transid failure
leaf parent key incorrect 91766784
extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered, value -1
./btrfs[0x415cb3]
./btrfs[0x416ee5]
./btrfs[0x417104]
./btrfs[0x418cea]
./btrfs[0x418f06]
./btrfs(btrfs_alloc_free_block+0x1e4)[0x41b8d0]
./btrfs(__btrfs_cow_block+0xd3)[0x40c5f9]
./btrfs(btrfs_cow_block+0x110)[0x40d03b]
./btrfs(commit_tree_roots+0x53)[0x439baa]
./btrfs(btrfs_commit_transaction+0xf9)[0x439f75]
./btrfs[0x467212]
./btrfs(handle_command_group+0x5d)[0x40b360]
./btrfs(cmd_rescue+0x15)[0x46749f]
./btrfs(main+0x163)[0x40b5e9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fc63f25db45]
./btrfs[0x40b0b9]
Aborted

This is btrfs-progs branch devel; the same happens with master or v4.13.3.

Stefan
Re: btrfs is slow while looping in search_bitmap <-btrfs_find_space_for_alloc
On 05.09.2017 at 07:58, Stefan Priebe - Profihost AG wrote:
> Hello,
>
> while experiencing slow btrfs volumes i switched to kernel v4.13 and to
> space_cache=v2.
...
>
> Is btrfs trying too hard to find free space?

Even though nobody replied, I'll reply to myself.

I could completely "fix" this by using the ssd_spread mount option for my raid50.

Greets,
Stefan
Re: speed up big btrfs volumes with ssds
Hello,

On 04.09.2017 at 20:32, Stefan Priebe - Profihost AG wrote:
> On 04.09.2017 at 15:28, Timofey Titovets wrote:
>> 2017-09-04 15:57 GMT+03:00 Stefan Priebe - Profihost AG
>> <s.pri...@profihost.ag>:
>>> On 04.09.2017 at 12:53, Henk Slager wrote:
>>>> On Sun, Sep 3, 2017 at 8:32 PM, Stefan Priebe - Profihost AG
>>>> <s.pri...@profihost.ag> wrote:
>>>>> Hello,
>>>>>
>>>>> i'm trying to speed up big btrfs volumes.
>>>>>
>>>>> Some facts:
>>>>> - Kernel will be 4.13-rc7
>>>>> - needed volume size is 60TB
>>>>>
>>>>> Currently without any ssds i get the best speed with:
>>>>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices
>>>>>
>>>>> and using btrfs as raid 0 for data and metadata on top of those 4 raid 5.
>>>>>
>>>>> I can live with a data loss every now and then ;-) so a raid 0 on
>>>>> top of the 4x raid5 is acceptable for me.
>>>>>
>>>>> Currently the write speed is not as good as i would like - especially
>>>>> for random 8k-16k I/O.
>>>>>
>>>>> My current idea is to use a pcie flash card with bcache on top of each
>>>>> raid 5.
>>>>
>>>> If it can speed up depends quite a lot on what the use-case is, for
>>>> some not-so-much-parallel-access it might work. So this 60TB is then
>>>> 20 4TB disks or so and the 4x 1GB cache is simply not very helpful I
>>>> think. The working set doesn't fit in it I guess. If there is mostly
>>>> single or a few users of the fs, a single pcie based bcacheing 4
>>>> devices can work, but for SATA SSD, I would use 1 SSD per HWraid5.
>>>
>>> Yes that's roughly my idea as well and yes the workload is 4 users max
>>> writing data. 50% sequential, 50% random.
>>>
>>>> Then roughly make sure the complete set of metadata blocks fits in the
>>>> cache. For an fs of this size let's say/estimate 150G. Then maybe same
>>>> or double for data, so an SSD of 500G would be a first try.
>>>
>>> I would use 1TB devices for each Raid or a 4TB PCIe card.
>>>
>>>> You give the impression that reliability for this fs is not the
>>>> highest prio, so if you go full risk, then put bcache in write-back
>>>> mode, then you will have your desired random 8k-16k I/O speedup after
>>>> the cache is warmed up. But any SW or HW failure will result in total
>>>> fs loss normally if SSD and HDD get out of sync somehow. Bcache
>>>> write-through might also be acceptable, you will need extensive
>>>> monitoring and tuning of all (bcache) parameters etc to be sure of the
>>>> right choice of size and setup etc.
>>>
>>> Yes i wanted to use the write back mode. Has anybody already made some
>>> test or experience with a setup like this?
>>>
>>
>> May be you can make work your raid setup faster by:
>> 1. Use Single Profile
>
> I'm already using the raid0 profile - see below:
>
> Data,RAID0: Size:22.57TiB, Used:21.08TiB
> Metadata,RAID0: Size:90.00GiB, Used:82.28GiB
> System,RAID0: Size:64.00MiB, Used:1.53MiB
>
>> 2. Use different stripe size for HW RAID5:
>> i think 16kb will be optimal with 5 devices per raid group
>> That will give you 64kb data stripe and 16kb parity
>> Btrfs raid0 use 64kb as stripe so that can make data access
>> unaligned (or use single profile for btrfs)
>
> That sounds like an interesting idea except for the unaligned writes.
> Will need to test this.
>
>> 3. Use btrfs ssd_spread to decrease RMW cycles.
> Can you explain this?
>
> Stefan

I was able to fix this issue with ssd_spread. Could it be that the default allocators, nossd and ssd, are searching too hard for free space? Even the free space tree (space_cache=v2) did not help.

Greets,
Stefan
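The Size/Used figures quoted in this message show how full the already allocated chunks are. The sketch below computes that fill ratio from such lines; the function name is hypothetical and the regular expression only covers the "Type,PROFILE: Size:..., Used:..." form seen in this thread. Whether nearly full chunks are what makes the default allocator search so long is an assumption here, not something the thread confirms.

```python
import re

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
LINE_RE = re.compile(r"(\w+),(\w+): Size:([\d.]+)(\w+), Used:([\d.]+)(\w+)")


def chunk_fill(usage_text):
    """Return {block group type: used/size ratio} from 'btrfs fi usage' lines."""
    fill = {}
    for kind, _profile, size, size_unit, used, used_unit in LINE_RE.findall(usage_text):
        size_bytes = float(size) * UNITS[size_unit]
        used_bytes = float(used) * UNITS[used_unit]
        fill[kind] = used_bytes / size_bytes
    return fill


USAGE = """\
Data,RAID0: Size:22.57TiB, Used:21.08TiB
Metadata,RAID0: Size:90.00GiB, Used:82.28GiB
System,RAID0: Size:64.00MiB, Used:1.53MiB
"""

for kind, ratio in chunk_fill(USAGE).items():
    print(f"{kind}: {ratio:.1%} of allocated chunks in use")
```

With the numbers above, both data and metadata chunks are more than 90% used even though the device itself is far from full.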
btrfs is slow while looping in search_bitmap <-btrfs_find_space_for_alloc
Hello,

while experiencing slow btrfs volumes I switched to kernel v4.13 and to space_cache=v2.

But I'm still experiencing slow performance and single kworker processes using 100% CPU.

Tracing the kworker process shows me:
# sed 's/.*: //' /trace | sort | uniq -c | sort -n
  21595 tree_search_offset.isra.23 <-btrfs_find_space_for_alloc
  21610 btrfs_find_space_for_alloc <-find_free_extent
  21619 _raw_spin_lock <-btrfs_find_space_for_alloc
  27431 _cond_resched <-find_free_extent
  27437 down_read <-find_free_extent
  27451 block_group_cache_done.isra.29 <-find_free_extent
  27451 btrfs_put_block_group <-find_free_extent
  27464 up_read <-find_free_extent
  27486 __get_raid_index <-find_free_extent
  27503 _raw_spin_lock <-find_free_extent
  48335 search_bitmap <-btrfs_find_space_for_alloc

Is there anything to optimize? Can I speed this up?

There's still plenty of unallocated space:
# btrfs fi usage /vmbackup/
Overall:
    Device size:                  58.20TiB
    Device allocated:             22.66TiB
    Device unallocated:           35.54TiB
    Device missing:                  0.00B
    Used:                         21.07TiB
    Free (estimated):             37.12TiB      (min: 37.12TiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,RAID0: Size:22.57TiB, Used:20.99TiB
   /dev/sdc1       5.64TiB
   /dev/sdd1       5.64TiB
   /dev/sde1       5.64TiB
   /dev/sdf1       5.64TiB

Metadata,RAID0: Size:90.00GiB, Used:81.60GiB
   /dev/sdc1      22.50GiB
   /dev/sdd1      22.50GiB
   /dev/sde1      22.50GiB
   /dev/sdf1      22.50GiB

System,RAID0: Size:64.00MiB, Used:1.53MiB
   /dev/sdc1      16.00MiB
   /dev/sdd1      16.00MiB
   /dev/sde1      16.00MiB
   /dev/sdf1      16.00MiB

Unallocated:
   /dev/sdc1       8.88TiB
   /dev/sdd1       8.88TiB
   /dev/sde1       8.88TiB
   /dev/sdf1       8.88TiB

Is btrfs trying too hard to find free space?

Greets,
Stefan
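For reference, the sed/sort/uniq pipeline used for the tally above can be reproduced in a few lines of Python, which is handy when the trace has to be post-processed further. This is a sketch only: the function name is made up, and the sample lines are illustrative entries in the ftrace "function" tracer format, not an excerpt of the real trace.

```python
from collections import Counter


def tally_trace(lines):
    """Count 'callee <-caller' pairs like: sed 's/.*: //' | sort | uniq -c | sort -n."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Like the greedy sed pattern, keep only what follows the last ': '.
        # Lines without ': ' are counted unchanged, as sed would pass them through.
        counts[line.rsplit(": ", 1)[-1]] += 1
    # Ascending by count; ties are ordered by name for a stable listing.
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))


# Illustrative sample lines (timestamps are made up).
SAMPLE = [
    "kworker/u24:4-13405 [003] 344186.202598: _raw_spin_lock <-find_free_extent",
    "kworker/u24:4-13405 [003] 344186.202598: btrfs_find_space_for_alloc <-find_free_extent",
    "kworker/u24:4-13405 [003] 344186.202599: search_bitmap <-btrfs_find_space_for_alloc",
    "kworker/u24:4-13405 [003] 344186.202600: search_bitmap <-btrfs_find_space_for_alloc",
]

for entry, count in tally_trace(SAMPLE):
    print(f"{count:7d} {entry}")
```

On a real system the input would be the lines of the saved trace file, for example `tally_trace(open("/trace"))`.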
Re: speed up big btrfs volumes with ssds
On 04.09.2017 at 15:28, Timofey Titovets wrote:
> 2017-09-04 15:57 GMT+03:00 Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag>:
>> On 04.09.2017 at 12:53, Henk Slager wrote:
>>> On Sun, Sep 3, 2017 at 8:32 PM, Stefan Priebe - Profihost AG
>>> <s.pri...@profihost.ag> wrote:
>>>> Hello,
>>>>
>>>> i'm trying to speed up big btrfs volumes.
>>>>
>>>> Some facts:
>>>> - Kernel will be 4.13-rc7
>>>> - needed volume size is 60TB
>>>>
>>>> Currently without any ssds i get the best speed with:
>>>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices
>>>>
>>>> and using btrfs as raid 0 for data and metadata on top of those 4 raid 5.
>>>>
>>>> I can live with a data loss every now and then ;-) so a raid 0 on
>>>> top of the 4x raid5 is acceptable for me.
>>>>
>>>> Currently the write speed is not as good as i would like - especially
>>>> for random 8k-16k I/O.
>>>>
>>>> My current idea is to use a pcie flash card with bcache on top of each
>>>> raid 5.
>>>
>>> If it can speed up depends quite a lot on what the use-case is, for
>>> some not-so-much-parallel-access it might work. So this 60TB is then
>>> 20 4TB disks or so and the 4x 1GB cache is simply not very helpful I
>>> think. The working set doesn't fit in it I guess. If there is mostly
>>> single or a few users of the fs, a single pcie based bcacheing 4
>>> devices can work, but for SATA SSD, I would use 1 SSD per HWraid5.
>>
>> Yes that's roughly my idea as well and yes the workload is 4 users max
>> writing data. 50% sequential, 50% random.
>>
>>> Then roughly make sure the complete set of metadata blocks fits in the
>>> cache. For an fs of this size let's say/estimate 150G. Then maybe same
>>> or double for data, so an SSD of 500G would be a first try.
>>
>> I would use 1TB devices for each Raid or a 4TB PCIe card.
>>
>>> You give the impression that reliability for this fs is not the
>>> highest prio, so if you go full risk, then put bcache in write-back
>>> mode, then you will have your desired random 8k-16k I/O speedup after
>>> the cache is warmed up. But any SW or HW failure will result in total
>>> fs loss normally if SSD and HDD get out of sync somehow. Bcache
>>> write-through might also be acceptable, you will need extensive
>>> monitoring and tuning of all (bcache) parameters etc to be sure of the
>>> right choice of size and setup etc.
>>
>> Yes i wanted to use the write back mode. Has anybody already made some
>> test or experience with a setup like this?
>>
>
> May be you can make work your raid setup faster by:
> 1. Use Single Profile

I'm already using the raid0 profile - see below:

Data,RAID0: Size:22.57TiB, Used:21.08TiB
Metadata,RAID0: Size:90.00GiB, Used:82.28GiB
System,RAID0: Size:64.00MiB, Used:1.53MiB

> 2. Use different stripe size for HW RAID5:
> i think 16kb will be optimal with 5 devices per raid group
> That will give you 64kb data stripe and 16kb parity
> Btrfs raid0 use 64kb as stripe so that can make data access
> unaligned (or use single profile for btrfs)

That sounds like an interesting idea except for the unaligned writes.
Will need to test this.

> 3. Use btrfs ssd_spread to decrease RMW cycles.

Can you explain this?

Stefan
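The arithmetic behind suggestion 2 can be checked with a short sketch. It assumes a RAID5 group of 5 devices (4 data chunks plus 1 parity chunk per stripe), the 64 KiB btrfs raid0 stripe size quoted above, and writes that start at offset 0 of the array; the function names are hypothetical, and real controllers may behave differently.

```python
def raid5_geometry(devices, chunk_kib):
    """Return (data KiB per full stripe, parity KiB per full stripe) for RAID5."""
    data_disks = devices - 1  # one chunk per stripe holds parity
    return data_disks * chunk_kib, chunk_kib


def is_full_stripe_write(offset_kib, length_kib, devices, chunk_kib):
    """True if a write covers whole RAID5 stripes and so needs no read-modify-write."""
    data_kib, _ = raid5_geometry(devices, chunk_kib)
    return (
        length_kib > 0
        and offset_kib % data_kib == 0
        and length_kib % data_kib == 0
    )


BTRFS_RAID0_STRIPE_KIB = 64  # stripe size quoted in the thread

for chunk in (16, 64, 256):
    data, parity = raid5_geometry(5, chunk)
    aligned = is_full_stripe_write(0, BTRFS_RAID0_STRIPE_KIB, 5, chunk)
    print(
        f"chunk {chunk:>3} KiB: data {data} KiB, parity {parity} KiB, "
        f"64 KiB write is full-stripe: {aligned}"
    )
```

Only the 16 KiB chunk size makes one 64 KiB btrfs stripe equal to one full RAID5 stripe; with larger chunks the same write touches only part of a stripe, so parity has to be updated via read-modify-write.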
Re: speed up big btrfs volumes with ssds
On 04.09.2017 at 12:53, Henk Slager wrote:
> On Sun, Sep 3, 2017 at 8:32 PM, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
>> Hello,
>>
>> i'm trying to speed up big btrfs volumes.
>>
>> Some facts:
>> - Kernel will be 4.13-rc7
>> - needed volume size is 60TB
>>
>> Currently without any ssds i get the best speed with:
>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices
>>
>> and using btrfs as raid 0 for data and metadata on top of those 4 raid 5.
>>
>> I can live with a data loss every now and then ;-) so a raid 0 on
>> top of the 4x raid5 is acceptable for me.
>>
>> Currently the write speed is not as good as i would like - especially
>> for random 8k-16k I/O.
>>
>> My current idea is to use a pcie flash card with bcache on top of each
>> raid 5.
>
> If it can speed up depends quite a lot on what the use-case is, for
> some not-so-much-parallel-access it might work. So this 60TB is then
> 20 4TB disks or so and the 4x 1GB cache is simply not very helpful I
> think. The working set doesn't fit in it I guess. If there is mostly
> single or a few users of the fs, a single pcie based bcacheing 4
> devices can work, but for SATA SSD, I would use 1 SSD per HWraid5.

Yes, that's roughly my idea as well, and yes, the workload is 4 users max writing data. 50% sequential, 50% random.

> Then roughly make sure the complete set of metadata blocks fits in the
> cache. For an fs of this size let's say/estimate 150G. Then maybe same
> or double for data, so an SSD of 500G would be a first try.

I would use 1TB devices for each RAID or a 4TB PCIe card.

> You give the impression that reliability for this fs is not the
> highest prio, so if you go full risk, then put bcache in write-back
> mode, then you will have your desired random 8k-16k I/O speedup after
> the cache is warmed up. But any SW or HW failure will result in total
> fs loss normally if SSD and HDD get out of sync somehow. Bcache
> write-through might also be acceptable, you will need extensive
> monitoring and tuning of all (bcache) parameters etc to be sure of the
> right choice of size and setup etc.

Yes, I wanted to use write-back mode. Has anybody already tested or gained experience with a setup like this?

Greets,
Stefan
speed up big btrfs volumes with ssds
Hello,

I'm trying to speed up big btrfs volumes.

Some facts:
- Kernel will be 4.13-rc7
- needed volume size is 60TB

Currently, without any SSDs, I get the best speed with:
- 4x HW RAID 5 with 1GB controller memory of 4TB 3,5" devices

and using btrfs as raid 0 for data and metadata on top of those 4 RAID 5 arrays.

I can live with a data loss every now and then ;-) so a raid 0 on top of the 4x RAID 5 is acceptable for me.

Currently the write speed is not as good as I would like, especially for random 8k-16k I/O.

My current idea is to use a PCIe flash card with bcache on top of each RAID 5.

Does this make sense as a way to improve the write speed?

Greets,
Stefan
Re: slow btrfs with a single kworker process using 100% CPU
n_read <-find_free_extent kworker/u24:4-13405 [003] 344186.202598: block_group_cache_done.isra.27 <-find_free_extent kworker/u24:4-13405 [003] 344186.202598: _raw_spin_lock <-find_free_extent kworker/u24:4-13405 [003] 344186.202598: btrfs_find_space_for_alloc <-find_free_extent kworker/u24:4-13405 [003] 344186.202598: _raw_spin_lock <-btrfs_find_space_for_alloc kworker/u24:4-13405 [003] 344186.202599: tree_search_offset.isra.25 <-btrfs_find_space_for_alloc kworker/u24:4-13405 [003] 344186.202623: __get_raid_index <-find_free_extent kworker/u24:4-13405 [003] 344186.202623: up_read <-find_free_extent kworker/u24:4-13405 [003] 344186.202623: btrfs_put_block_group <-find_free_extent kworker/u24:4-13405 [003] 344186.202623: _cond_resched <-find_free_extent Greets, Stefan Am 20.08.2017 um 13:00 schrieb Stefan Priebe - Profihost AG: > Hello, > > this still happens with space_cache v2. I don't think it is space_cache > related? > > Stefan > > Am 17.08.2017 um 09:43 schrieb Stefan Priebe - Profihost AG: >> while mounting the device the dmesg is full of: >> [ 1320.325147] [] ? kthread_park+0x60/0x60 >> [ 1440.330008] INFO: task btrfs-transacti:3701 blocked for more than 120 >> seconds. >> [ 1440.330014] Not tainted 4.4.82+525-ph #1 >> [ 1440.330015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" >> disables this message. 
>> [ 1440.330020] btrfs-transacti D 88080964fdd8 0 3701 2 >> 0x0008 >> [ 1440.330024] 88080964fdd8 a8e10500 880859d4cb00 >> 88080965 >> [ 1440.330026] 881056069800 88080964fe08 88080a10 >> 88080a100068 >> [ 1440.330028] 88080964fdf0 a86d2b75 880036c92000 >> 88080964fe58 >> [ 1440.330028] Call Trace: >> [ 1440.330053] [] schedule+0x35/0x80 >> [ 1440.330120] [] >> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] >> [ 1440.330159] [] btrfs_commit_transaction+0x3a/0x70 >> [btrfs] >> [ 1440.330186] [] transaction_kthread+0x1d5/0x240 [btrfs] >> [ 1440.330194] [] kthread+0xeb/0x110 >> [ 1440.330200] [] ret_from_fork+0x3f/0x70 >> [ 1440.16] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 >> >> [ 1440.17] Leftover inexact backtrace: >> >> [ 1440.22] [] ? kthread_park+0x60/0x60 >> [ 1560.335839] INFO: task btrfs-transacti:3701 blocked for more than 120 >> seconds. >> [ 1560.335843] Not tainted 4.4.82+525-ph #1 >> [ 1560.335843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" >> disables this message. >> [ 1560.335848] btrfs-transacti D 88080964fdd8 0 3701 2 >> 0x0008 >> [ 1560.335852] 88080964fdd8 a8e10500 880859d4cb00 >> 88080965 >> [ 1560.335854] 881056069800 88080964fe08 88080a10 >> 88080a100068 >> [ 1560.335856] 88080964fdf0 a86d2b75 880036c92000 >> 88080964fe58 >> [ 1560.335857] Call Trace: >> [ 1560.335875] [] schedule+0x35/0x80 >> [ 1560.335953] [] >> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] >> [ 1560.335978] [] btrfs_commit_transaction+0x3a/0x70 >> [btrfs] >> [ 1560.335995] [] transaction_kthread+0x1d5/0x240 [btrfs] >> [ 1560.336001] [] kthread+0xeb/0x110 >> [ 1560.336006] [] ret_from_fork+0x3f/0x70 >> [ 1560.337829] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 >> >> [ 1560.337830] Leftover inexact backtrace: >> >> [ 1560.337833] [] ? kthread_park+0x60/0x60 >> [ 1680.341127] INFO: task btrfs-transacti:3701 blocked for more than 120 >> seconds. 
>> [ 1680.341130] Not tainted 4.4.82+525-ph #1 >> [ 1680.341131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" >> disables this message. >> [ 1680.341134] btrfs-transacti D 88080964fdd8 0 3701 2 >> 0x0008 >> [ 1680.341137] 88080964fdd8 a8e10500 880859d4cb00 >> 88080965 >> [ 1680.341138] 881056069800 88080964fe08 88080a10 >> 88080a100068 >> [ 1680.341139] 88080964fdf0 a86d2b75 880036c92000 >> 88080964fe58 >> [ 1680.341140] Call Trace: >> [ 1680.341155] [] schedule+0x35/0x80 >> [ 1680.341211] [] >> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] >> [ 1680.341237] [] btrfs_commit_transaction+0x3a/0x70 >> [btrfs] >> [ 1680.341252] [] transaction_kthread+0x1d5/0x240 [btrfs] >> [ 1680.341258] [] kthread+0xeb/0x110 >> [ 1680.341262] [] ret_from_fork+0x3f/0x70 >> [ 1680.343062] DWARF2 unwinder stuck at ret_from_f
Re: slow btrfs with a single kworker process using 100% CPU
Hello, this still happens with space_cache v2. I don't think it is space_cache related? Stefan Am 17.08.2017 um 09:43 schrieb Stefan Priebe - Profihost AG: > while mounting the device the dmesg is full of: > [ 1320.325147] [] ? kthread_park+0x60/0x60 > [ 1440.330008] INFO: task btrfs-transacti:3701 blocked for more than 120 > seconds. > [ 1440.330014] Not tainted 4.4.82+525-ph #1 > [ 1440.330015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 1440.330020] btrfs-transacti D 88080964fdd8 0 3701 2 > 0x0008 > [ 1440.330024] 88080964fdd8 a8e10500 880859d4cb00 > 88080965 > [ 1440.330026] 881056069800 88080964fe08 88080a10 > 88080a100068 > [ 1440.330028] 88080964fdf0 a86d2b75 880036c92000 > 88080964fe58 > [ 1440.330028] Call Trace: > [ 1440.330053] [] schedule+0x35/0x80 > [ 1440.330120] [] > btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] > [ 1440.330159] [] btrfs_commit_transaction+0x3a/0x70 > [btrfs] > [ 1440.330186] [] transaction_kthread+0x1d5/0x240 [btrfs] > [ 1440.330194] [] kthread+0xeb/0x110 > [ 1440.330200] [] ret_from_fork+0x3f/0x70 > [ 1440.16] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 > > [ 1440.17] Leftover inexact backtrace: > > [ 1440.22] [] ? kthread_park+0x60/0x60 > [ 1560.335839] INFO: task btrfs-transacti:3701 blocked for more than 120 > seconds. > [ 1560.335843] Not tainted 4.4.82+525-ph #1 > [ 1560.335843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. 
> [ 1560.335848] btrfs-transacti D 88080964fdd8 0 3701 2 > 0x0008 > [ 1560.335852] 88080964fdd8 a8e10500 880859d4cb00 > 88080965 > [ 1560.335854] 881056069800 88080964fe08 88080a10 > 88080a100068 > [ 1560.335856] 88080964fdf0 a86d2b75 880036c92000 > 88080964fe58 > [ 1560.335857] Call Trace: > [ 1560.335875] [] schedule+0x35/0x80 > [ 1560.335953] [] > btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] > [ 1560.335978] [] btrfs_commit_transaction+0x3a/0x70 > [btrfs] > [ 1560.335995] [] transaction_kthread+0x1d5/0x240 [btrfs] > [ 1560.336001] [] kthread+0xeb/0x110 > [ 1560.336006] [] ret_from_fork+0x3f/0x70 > [ 1560.337829] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 > > [ 1560.337830] Leftover inexact backtrace: > > [ 1560.337833] [] ? kthread_park+0x60/0x60 > [ 1680.341127] INFO: task btrfs-transacti:3701 blocked for more than 120 > seconds. > [ 1680.341130] Not tainted 4.4.82+525-ph #1 > [ 1680.341131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 1680.341134] btrfs-transacti D 88080964fdd8 0 3701 2 > 0x0008 > [ 1680.341137] 88080964fdd8 a8e10500 880859d4cb00 > 88080965 > [ 1680.341138] 881056069800 88080964fe08 88080a10 > 88080a100068 > [ 1680.341139] 88080964fdf0 a86d2b75 880036c92000 > 88080964fe58 > [ 1680.341140] Call Trace: > [ 1680.341155] [] schedule+0x35/0x80 > [ 1680.341211] [] > btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] > [ 1680.341237] [] btrfs_commit_transaction+0x3a/0x70 > [btrfs] > [ 1680.341252] [] transaction_kthread+0x1d5/0x240 [btrfs] > [ 1680.341258] [] kthread+0xeb/0x110 > [ 1680.341262] [] ret_from_fork+0x3f/0x70 > [ 1680.343062] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 > > Stefan > > Am 17.08.2017 um 07:47 schrieb Stefan Priebe - Profihost AG: >> i've backported the free space cache tree to my kerne and hopefully any >> fixes related to it. >> >> The first mount with clear_cache,space_cache=v2 took around 5 hours. 
>> >> Currently i do not see any kworker with 100CPU but i don't see much load >> at all. >> >> btrfs-transaction tooks around 2-4% CPU together with a kworker process >> and some 2-3% mdadm processes. I/O Wait is at 3%. >> >> That's it. It does not do much more. Writing a file does not work. >> >> Greets, >> Stefan >> >> Am 16.08.2017 um 14:29 schrieb Konstantin V. Gavrilenko: >>> Roman, initially I had a single process occupying 100% CPU, when sysrq it >>> was indicating as "btrfs_find_space_for_alloc" >>> but that's when I used the autodefrag, compress, forcecompress and >>> commit=10 mount flags and space_cache was v1 by default. >>> when I switched to "relatime,compress-force=zlib,space_cache=v2" the 100% >>> cpu has dissapeared, but the shite performance remained. >>> >>> >>>
Re: slow btrfs with a single kworker process using 100% CPU
while mounting the device the dmesg is full of: [ 1320.325147] [] ? kthread_park+0x60/0x60 [ 1440.330008] INFO: task btrfs-transacti:3701 blocked for more than 120 seconds. [ 1440.330014] Not tainted 4.4.82+525-ph #1 [ 1440.330015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1440.330020] btrfs-transacti D 88080964fdd8 0 3701 2 0x0008 [ 1440.330024] 88080964fdd8 a8e10500 880859d4cb00 88080965 [ 1440.330026] 881056069800 88080964fe08 88080a10 88080a100068 [ 1440.330028] 88080964fdf0 a86d2b75 880036c92000 88080964fe58 [ 1440.330028] Call Trace: [ 1440.330053] [] schedule+0x35/0x80 [ 1440.330120] [] btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] [ 1440.330159] [] btrfs_commit_transaction+0x3a/0x70 [btrfs] [ 1440.330186] [] transaction_kthread+0x1d5/0x240 [btrfs] [ 1440.330194] [] kthread+0xeb/0x110 [ 1440.330200] [] ret_from_fork+0x3f/0x70 [ 1440.16] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 [ 1440.17] Leftover inexact backtrace: [ 1440.22] [] ? kthread_park+0x60/0x60 [ 1560.335839] INFO: task btrfs-transacti:3701 blocked for more than 120 seconds. [ 1560.335843] Not tainted 4.4.82+525-ph #1 [ 1560.335843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1560.335848] btrfs-transacti D 88080964fdd8 0 3701 2 0x0008 [ 1560.335852] 88080964fdd8 a8e10500 880859d4cb00 88080965 [ 1560.335854] 881056069800 88080964fe08 88080a10 88080a100068 [ 1560.335856] 88080964fdf0 a86d2b75 880036c92000 88080964fe58 [ 1560.335857] Call Trace: [ 1560.335875] [] schedule+0x35/0x80 [ 1560.335953] [] btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] [ 1560.335978] [] btrfs_commit_transaction+0x3a/0x70 [btrfs] [ 1560.335995] [] transaction_kthread+0x1d5/0x240 [btrfs] [ 1560.336001] [] kthread+0xeb/0x110 [ 1560.336006] [] ret_from_fork+0x3f/0x70 [ 1560.337829] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 [ 1560.337830] Leftover inexact backtrace: [ 1560.337833] [] ? 
kthread_park+0x60/0x60 [ 1680.341127] INFO: task btrfs-transacti:3701 blocked for more than 120 seconds. [ 1680.341130] Not tainted 4.4.82+525-ph #1 [ 1680.341131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1680.341134] btrfs-transacti D 88080964fdd8 0 3701 2 0x0008 [ 1680.341137] 88080964fdd8 a8e10500 880859d4cb00 88080965 [ 1680.341138] 881056069800 88080964fe08 88080a10 88080a100068 [ 1680.341139] 88080964fdf0 a86d2b75 880036c92000 88080964fe58 [ 1680.341140] Call Trace: [ 1680.341155] [] schedule+0x35/0x80 [ 1680.341211] [] btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs] [ 1680.341237] [] btrfs_commit_transaction+0x3a/0x70 [btrfs] [ 1680.341252] [] transaction_kthread+0x1d5/0x240 [btrfs] [ 1680.341258] [] kthread+0xeb/0x110 [ 1680.341262] [] ret_from_fork+0x3f/0x70 [ 1680.343062] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 Stefan Am 17.08.2017 um 07:47 schrieb Stefan Priebe - Profihost AG: > i've backported the free space cache tree to my kerne and hopefully any > fixes related to it. > > The first mount with clear_cache,space_cache=v2 took around 5 hours. > > Currently i do not see any kworker with 100CPU but i don't see much load > at all. > > btrfs-transaction tooks around 2-4% CPU together with a kworker process > and some 2-3% mdadm processes. I/O Wait is at 3%. > > That's it. It does not do much more. Writing a file does not work. > > Greets, > Stefan > > Am 16.08.2017 um 14:29 schrieb Konstantin V. Gavrilenko: >> Roman, initially I had a single process occupying 100% CPU, when sysrq it >> was indicating as "btrfs_find_space_for_alloc" >> but that's when I used the autodefrag, compress, forcecompress and commit=10 >> mount flags and space_cache was v1 by default. >> when I switched to "relatime,compress-force=zlib,space_cache=v2" the 100% >> cpu has dissapeared, but the shite performance remained. >> >> >> As to the chunk size, there is no information in the article about the type >> of data that was used. 
While in our case we are pretty certain about the >> compressed block size (32-128). I am currently inclining towards 32k as it >> might be ideal in a situation when we have a 5 disk raid5 array. >> >> In theory >> 1. The minimum compressed write (32k) would fill the chunk on a single disk, >> thus the IO cost of the operation would be 2 reads (original chunk + >> original parity) and 2 writes (new chunk + new parity) >> >> 2. The maximum compressed write (128k) would require the update of 1 chunk &
Re: slow btrfs with a single kworker process using 100% CPU
I've backported the free space cache tree to my kernel, and hopefully any
fixes related to it.

The first mount with clear_cache,space_cache=v2 took around 5 hours.

Currently I do not see any kworker at 100% CPU, but I don't see much load
at all.

btrfs-transaction takes around 2-4% CPU together with a kworker process,
plus some mdadm processes at 2-3%. I/O wait is at 3%.

That's it. It does not do much more. Writing a file does not work.

Greets,
Stefan

On 16.08.2017 at 14:29, Konstantin V. Gavrilenko wrote:
> Roman, initially I had a single process occupying 100% CPU; sysrq
> showed it in "btrfs_find_space_for_alloc",
> but that was when I used the autodefrag, compress, forcecompress and
> commit=10 mount flags, and space_cache was v1 by default.
> When I switched to "relatime,compress-force=zlib,space_cache=v2" the 100%
> CPU usage disappeared, but the shite performance remained.
>
> As to the chunk size, there is no information in the article about the type
> of data that was used. In our case, however, we are pretty certain about the
> compressed block size (32-128). I am currently inclining towards 32k as it
> might be ideal in a situation where we have a 5-disk RAID5 array.
>
> In theory:
> 1. The minimum compressed write (32k) would fill the chunk on a single disk,
> thus the I/O cost of the operation would be 2 reads (original chunk +
> original parity) and 2 writes (new chunk + new parity)
>
> 2. The maximum compressed write (128k) would require the update of 1 chunk on
> each of the 4 data disks + 1 parity write
>
> Stefan, what mount flags do you use?
>
> kos
>
> ----- Original Message -----
> From: "Roman Mamedov" <r...@romanrm.net>
> To: "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com>
> Cc: "Stefan Priebe - Profihost AG" <s.pri...@profihost.ag>, "Marat Khalili"
> <m...@rqc.ru>, linux-btrfs@vger.kernel.org, "Peter Grandi"
> <p...@btrfs.list.sabi.co.uk>
> Sent: Wednesday, 16 August, 2017 2:00:03 PM
> Subject: Re: slow btrfs with a single kworker process using 100% CPU
>
> On Wed, 16 Aug 2017 12:48:42 +0100 (BST)
> "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com> wrote:
>
>> I believe the chunk size of 512kb is even worse for performance than the
>> default setting of 256kb on my HW RAID.
>
> It might be, but that does not explain the originally reported problem at all.
> If mdraid performance were the bottleneck, you would see high iowait and
> possibly some CPU load from the mdX_raidY threads, but not a single Btrfs
> thread pegged at 100% CPU.
>
>> So now I am moving the data off the array and will be rebuilding it with a
>> 64 or 32 chunk size and checking the performance.
>
> 64K is the sweet spot for RAID5/6:
> http://louwrentius.com/linux-raid-level-and-chunk-size-the-benchmarks.html
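The read-modify-write reasoning quoted above (a 32k write costing 2 reads and 2 writes, a 128k write costing 4 data writes plus 1 parity write on a 5-disk RAID5) can be sketched as simple arithmetic. This is only an illustrative model, not how md actually schedules I/O: the function name `raid5_write_cost` is made up for this example, writes are assumed to start on a stripe boundary, and a partial stripe is assumed to always use read-modify-write (a real array may choose to reconstruct parity from the untouched chunks instead).

```python
def raid5_write_cost(write_kib, chunk_kib, disks):
    """Return (reads, writes) as per-disk I/O counts for one aligned write.

    Simplified model (assumption): full stripes are written without reading,
    a partial stripe is updated via read-modify-write.
    """
    data_disks = disks - 1                      # one chunk per stripe holds parity
    chunks = -(-write_kib // chunk_kib)         # ceiling division: chunks touched
    full_stripes, rest = divmod(chunks, data_disks)

    reads = 0
    writes = full_stripes * disks               # data chunks + 1 parity per stripe
    if rest:
        reads += rest + 1                       # old data chunks + old parity
        writes += rest + 1                      # new data chunks + new parity
    return reads, writes


# 5-disk RAID5 with 32k chunks, as in the quoted reasoning
print(raid5_write_cost(32, 32, 5))    # (2, 2)
print(raid5_write_cost(128, 32, 5))   # (0, 5)
# 4-disk RAID5 with 512k chunks: even a 128k write is a partial-stripe update
print(raid5_write_cost(128, 512, 4))  # (2, 2)
```

Under these assumptions, the 512k chunk size means every compressed write of up to 128k falls into the read-modify-write case, which is consistent with the suspicion that a smaller chunk size would suit this workload better. It does not, by itself, explain a single thread pegged at 100% CPU.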
Re: slow btrfs with a single kworker process using 100% CPU
On 16.08.2017 at 14:29, Konstantin V. Gavrilenko wrote:
> Roman, initially I had a single process occupying 100% CPU; sysrq
> showed it in "btrfs_find_space_for_alloc",
> but that was when I used the autodefrag, compress, forcecompress and
> commit=10 mount flags, and space_cache was v1 by default.
> When I switched to "relatime,compress-force=zlib,space_cache=v2" the 100%
> CPU usage disappeared, but the shite performance remained.

space_cache=v2 is not supported by the openSUSE kernel, but I compile the
kernel myself anyway. Is there a patchset to add support for space_cache=v2?

Greets,
Stefan

> As to the chunk size, there is no information in the article about the type
> of data that was used. In our case, however, we are pretty certain about the
> compressed block size (32-128). I am currently inclining towards 32k as it
> might be ideal in a situation where we have a 5-disk RAID5 array.
>
> In theory:
> 1. The minimum compressed write (32k) would fill the chunk on a single disk,
> thus the I/O cost of the operation would be 2 reads (original chunk +
> original parity) and 2 writes (new chunk + new parity)
>
> 2. The maximum compressed write (128k) would require the update of 1 chunk on
> each of the 4 data disks + 1 parity write
>
> Stefan, what mount flags do you use?
>
> kos
>
> ----- Original Message -----
> From: "Roman Mamedov" <r...@romanrm.net>
> To: "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com>
> Cc: "Stefan Priebe - Profihost AG" <s.pri...@profihost.ag>, "Marat Khalili"
> <m...@rqc.ru>, linux-btrfs@vger.kernel.org, "Peter Grandi"
> <p...@btrfs.list.sabi.co.uk>
> Sent: Wednesday, 16 August, 2017 2:00:03 PM
> Subject: Re: slow btrfs with a single kworker process using 100% CPU
>
> On Wed, 16 Aug 2017 12:48:42 +0100 (BST)
> "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com> wrote:
>
>> I believe the chunk size of 512kb is even worse for performance than the
>> default setting of 256kb on my HW RAID.
>
> It might be, but that does not explain the originally reported problem at all.
> If mdraid performance were the bottleneck, you would see high iowait and
> possibly some CPU load from the mdX_raidY threads, but not a single Btrfs
> thread pegged at 100% CPU.
>
>> So now I am moving the data off the array and will be rebuilding it with a
>> 64 or 32 chunk size and checking the performance.
>
> 64K is the sweet spot for RAID5/6:
> http://louwrentius.com/linux-raid-level-and-chunk-size-the-benchmarks.html
Re: slow btrfs with a single kworker process using 100% CPU
On 16.08.2017 at 14:29, Konstantin V. Gavrilenko wrote:
> Roman, initially I had a single process occupying 100% CPU; sysrq
> showed it in "btrfs_find_space_for_alloc",
> but that was when I used the autodefrag, compress, forcecompress and
> commit=10 mount flags, and space_cache was v1 by default.
> When I switched to "relatime,compress-force=zlib,space_cache=v2" the 100%
> CPU usage disappeared, but the shite performance remained.
>
> As to the chunk size, there is no information in the article about the type
> of data that was used. In our case, however, we are pretty certain about the
> compressed block size (32-128). I am currently inclining towards 32k as it
> might be ideal in a situation where we have a 5-disk RAID5 array.
>
> In theory:
> 1. The minimum compressed write (32k) would fill the chunk on a single disk,
> thus the I/O cost of the operation would be 2 reads (original chunk +
> original parity) and 2 writes (new chunk + new parity)
>
> 2. The maximum compressed write (128k) would require the update of 1 chunk on
> each of the 4 data disks + 1 parity write
>
> Stefan, what mount flags do you use?

noatime,compress-force=zlib,noacl,space_cache,skip_balance,subvolid=5,subvol=/

Greets,
Stefan

> kos
>
> ----- Original Message -----
> From: "Roman Mamedov" <r...@romanrm.net>
> To: "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com>
> Cc: "Stefan Priebe - Profihost AG" <s.pri...@profihost.ag>, "Marat Khalili"
> <m...@rqc.ru>, linux-btrfs@vger.kernel.org, "Peter Grandi"
> <p...@btrfs.list.sabi.co.uk>
> Sent: Wednesday, 16 August, 2017 2:00:03 PM
> Subject: Re: slow btrfs with a single kworker process using 100% CPU
>
> On Wed, 16 Aug 2017 12:48:42 +0100 (BST)
> "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com> wrote:
>
>> I believe the chunk size of 512kb is even worse for performance than the
>> default setting of 256kb on my HW RAID.
>
> It might be, but that does not explain the originally reported problem at all.
> If mdraid performance were the bottleneck, you would see high iowait and
> possibly some CPU load from the mdX_raidY threads, but not a single Btrfs
> thread pegged at 100% CPU.
>
>> So now I am moving the data off the array and will be rebuilding it with a
>> 64 or 32 chunk size and checking the performance.
>
> 64K is the sweet spot for RAID5/6:
> http://louwrentius.com/linux-raid-level-and-chunk-size-the-benchmarks.html
Re: slow btrfs with a single kworker process using 100% CPU
On 16.08.2017 at 11:02, Konstantin V. Gavrilenko wrote:
> Could be a similar issue to what I had recently, with RAID5 and a 256kb chunk
> size.
> Please provide more information about your RAID setup.

Hope this helps:

# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md0 : active raid5 sdd1[1] sdf1[4] sdc1[0] sde1[2]
      11717406720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] []
      bitmap: 6/30 pages [24KB], 65536KB chunk

md2 : active raid5 sdm1[2] sdl1[1] sdk1[0] sdn1[4]
      11717406720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] []
      bitmap: 7/30 pages [28KB], 65536KB chunk

md1 : active raid5 sdi1[2] sdg1[0] sdj1[4] sdh1[1]
      11717406720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] []
      bitmap: 7/30 pages [28KB], 65536KB chunk

md3 : active raid5 sdp1[1] sdo1[0] sdq1[2] sdr1[4]
      11717406720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] []
      bitmap: 6/30 pages [24KB], 65536KB chunk

# btrfs fi usage /vmbackup/
Overall:
    Device size:         43.65TiB
    Device allocated:    31.98TiB
    Device unallocated:  11.67TiB
    Device missing:      0.00B
    Used:                30.80TiB
    Free (estimated):    12.84TiB (min: 12.84TiB)
    Data ratio:          1.00
    Metadata ratio:      1.00
    Global reserve:      512.00MiB (used: 0.00B)

Data,RAID0: Size:31.83TiB, Used:30.66TiB
    /dev/md0    7.96TiB
    /dev/md1    7.96TiB
    /dev/md2    7.96TiB
    /dev/md3    7.96TiB

Metadata,RAID0: Size:153.00GiB, Used:141.34GiB
    /dev/md0    38.25GiB
    /dev/md1    38.25GiB
    /dev/md2    38.25GiB
    /dev/md3    38.25GiB

System,RAID0: Size:128.00MiB, Used:2.28MiB
    /dev/md0    32.00MiB
    /dev/md1    32.00MiB
    /dev/md2    32.00MiB
    /dev/md3    32.00MiB

Unallocated:
    /dev/md0    2.92TiB
    /dev/md1    2.92TiB
    /dev/md2    2.92TiB
    /dev/md3    2.92TiB

Stefan

> p.s.
> You can also check the thread "Btrfs + compression = slow performance and high
> cpu usage"
>
> ----- Original Message -----
> From: "Stefan Priebe - Profihost AG" <s.pri...@profihost.ag>
> To: "Marat Khalili" <m...@rqc.ru>, linux-btrfs@vger.kernel.org
> Sent: Wednesday, 16 August, 2017 10:37:43 AM
> Subject: Re: slow btrfs with a single kworker process using 100% CPU
>
> On 16.08.2017 at 08:53, Marat Khalili wrote:
>>> I've one system where a single kworker process is using 100% CPU
>>> sometimes a second process comes up with 100% CPU [btrfs-transacti]. Is
>>> there anything i can do to get the old speed again or find the culprit?
>>
>> 1. Do you use quotas (qgroups)?
>
> No qgroups and no quota.
>
>> 2. Do you have a lot of snapshots? Have you deleted some recently?
>
> 1413 Snapshots. I'm deleting 50 of them every night. But btrfs-cleaner
> process isn't running / consuming CPU currently.
>
>> More info about your system would help too.
>
> Kernel is OpenSuSE Leap 42.3.
>
> btrfs is mounted with
> compress-force=zlib
>
> btrfs is running as a raid0 on top of 4 md raid 5 devices.
>
> Greets,
> Stefan
Re: slow btrfs with a single kworker process using 100% CPU
On 16.08.2017 at 08:53, Marat Khalili wrote:
>> I've one system where a single kworker process is using 100% CPU
>> sometimes a second process comes up with 100% CPU [btrfs-transacti]. Is
>> there anything i can do to get the old speed again or find the culprit?
>
> 1. Do you use quotas (qgroups)?

No qgroups and no quota.

> 2. Do you have a lot of snapshots? Have you deleted some recently?

1413 snapshots. I'm deleting 50 of them every night, but the btrfs-cleaner
process isn't currently running or consuming CPU.

> More info about your system would help too.

The kernel is the one from openSUSE Leap 42.3.

btrfs is mounted with compress-force=zlib.

btrfs is running as RAID0 on top of 4 md RAID5 devices.

Greets,
Stefan
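As a sanity check on the layout described here (btrfs RAID0 on top of four md RAID5 arrays), the figures from the /proc/mdstat and "btrfs fi usage" output posted in the follow-up can be cross-checked with a little arithmetic. This is only a sketch: the variable names are made up for illustration, and it relies on /proc/mdstat reporting array sizes in 1 KiB blocks.

```python
KIB = 1024
TIB = 1024 ** 4

blocks_per_array = 11717406720  # per md array, from /proc/mdstat (1 KiB blocks)
arrays = 4                      # md0..md3, striped together by btrfs RAID0
members = 4                     # partitions per RAID5 array ([4/4])
chunk_kib = 512                 # "512k chunk" in /proc/mdstat

# Usable size of one array, and of the btrfs volume spanning all four
array_tib = blocks_per_array * KIB / TIB
total_tib = arrays * array_tib

# A full RAID5 stripe covers one chunk on every data disk (members - 1)
full_stripe_kib = (members - 1) * chunk_kib

print(f"per array:   {array_tib:.2f} TiB")      # per array:   10.91 TiB
print(f"btrfs total: {total_tib:.2f} TiB")      # btrfs total: 43.65 TiB
print(f"full stripe: {full_stripe_kib} KiB")    # full stripe: 1536 KiB
```

The total matches the "Device size: 43.65TiB" line, and the 1536 KiB full stripe is much larger than a typical compressed btrfs extent, so most writes on these arrays are partial-stripe updates.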
slow btrfs with a single kworker process using 100% CPU
Hello,

I have one system where a single kworker process is using 100% CPU; sometimes
a second process, [btrfs-transacti], also comes up at 100% CPU. Is there
anything I can do to get the old speed back or to find the culprit?

Greets,
Stefan
Re: runtime btrfsck
Hello,

thanks. But is there any way to recover from this error? Like removing the
item or so? Data loss isn't a problem. Just reconstructing the whole FS will
take quite a long time.

Stefan

On 10.05.2017 at 11:54, Hugo Mills wrote:
> On Wed, May 10, 2017 at 11:20:58AM +0200, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> here's the output:
>> # for block in 163316514816 163322413056 163325722624; do echo $block;
>> btrfs-debug-tree -b $block /dev/mapper/crypt_md0|sed -re 's/(\t| )name:
>> .*/\1name: HIDDEN/'; done
>>
>> 163316514816
>> btrfs-progs v4.8.5
>> leaf 163316514816 items 188 free space 1387 generation 86739 owner 3892
>> fs uuid 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
>> chunk uuid b86efe94-ab40-4344-ac6b-46ec59c41b8f
> [...]
>> item 37 key (23760 DIR_INDEX 36) itemoff 14278 itemsize 58
>>     location key (28124232 INODE_ITEM 0) type FILE
>>     transid 86739 data_len 0 name_len 28
>>     name: HIDDEN
>> item 38 key (23760 DIR_INDEX 37) itemoff 14220 itemsize 58
>>     location key (28124233 INODE_ITEM 0) type FILE
>>     transid 86739 data_len 0 name_len 28
>>     name: HIDDEN
>> item 39 key (23760 DIR_INDEX 38) itemoff 14165 itemsize 55
>>     location key (28124234 INODE_ITEM 0) type FILE
>>     transid 86739 data_len 0 name_len 25
>>     name: HIDDEN
>> item 40 key (23760 DIR_INDEX 22) itemoff 14115 itemsize 50
>>     location key (26923383 INODE_ITEM 0) type FILE
>>     transid 74009 data_len 0 name_len 20
>>     name: HIDDEN
>> item 41 key (23760 DIR_INDEX 23) itemoff 14067 itemsize 48
>>     location key (26923384 INODE_ITEM 0) type FILE
>>     transid 74009 data_len 0 name_len 18
>>     name: HIDDEN
> [...]
>> 163322413056
>> btrfs-progs v4.8.5
>> leaf 163322413056 items 113 free space 934 generation 86739 owner 3892
>> fs uuid 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
>> chunk uuid b86efe94-ab40-4344-ac6b-46ec59c41b8f
> [...]
>> item 73 key (24016 DIR_INDEX 19) itemoff 9651 itemsize 62
>>     location key (28124251 INODE_ITEM 0) type FILE
>>     transid 86739 data_len 0 name_len 32
>>     name: HIDDEN
>> item 74 key (24016 DIR_INDEX 20) itemoff 9592 itemsize 59
>>     location key (28124252 INODE_ITEM 0) type FILE
>>     transid 86739 data_len 0 name_len 29
>>     name: HIDDEN
>> item 75 key (24016 DIR_INDEX 4) itemoff 9538 itemsize 54
>>     location key (26923401 INODE_ITEM 0) type FILE
>>     transid 74009 data_len 0 name_len 24
>>     name: HIDDEN
>> item 76 key (24016 DIR_INDEX 5) itemoff 9486 itemsize 52
>>     location key (26923402 INODE_ITEM 0) type FILE
>>     transid 74009 data_len 0 name_len 22
>>     name: HIDDEN
> [...]
>> 163325722624
>> btrfs-progs v4.8.5
>> leaf 163325722624 items 78 free space 6563 generation 86739 owner 3892
>> fs uuid 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
>> chunk uuid b86efe94-ab40-4344-ac6b-46ec59c41b8f
> [...]
>> item 62 key (24189 DIR_INDEX 16) itemoff 9409 itemsize 64
>>     location key (28124267 INODE_ITEM 0) type FILE
>>     transid 86739 data_len 0 name_len 34
>>     name: HIDDEN
>> item 63 key (24189 DIR_INDEX 17) itemoff 9349 itemsize 60
>>     location key (28124268 INODE_ITEM 0) type FILE
>>     transid 86739 data_len 0 name_len 30
>>     name: HIDDEN
>> item 64 key (24189 DIR_INDEX 4) itemoff 9296 itemsize 53
>>     location key (26923421 INODE_ITEM 0) type FILE
>>     transid 74010 data_len 0 name_len 23
>>     name: HIDDEN
>> item 65 key (24189 DIR_INDEX 5) itemoff 9236 itemsize 60
>>     location key (26923422 INODE_ITEM 0) type FILE
>>     transid 74010 data_len 0 name_len 30
>>     name: HIDDEN
> [...]
>
> In each case, the tree node keys have simply been sorted
> incorrectly -- the ordering is otherwise correct, but jumps backwards
> at some point in the sequence. At least in the first instance, some of
> the keys appear to have been duplicated: there are two (23760
> DIR_INDEX 22) keys in the list. (I didn't check in detail with the
> other two whether there are duplicates, but I would
Re: runtime btrfsck
Hello, here's the output: # for block in 163316514816 163322413056 163325722624; do echo $block; btrfs-debug-tree -b $block /dev/mapper/crypt_md0|sed -re 's/(\t| )name: .*/\1name: HIDDEN/'; done 163316514816 btrfs-progs v4.8.5 leaf 163316514816 items 188 free space 1387 generation 86739 owner 3892 fs uuid 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9 chunk uuid b86efe94-ab40-4344-ac6b-46ec59c41b8f item 0 key (23760 DIR_ITEM 2479948887) itemoff 16229 itemsize 54 location key (26923382 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 24 name: HIDDEN item 1 key (23760 DIR_ITEM 2652742785) itemoff 16170 itemsize 59 location key (28124230 INODE_ITEM 0) type FILE transid 86739 data_len 0 name_len 29 name: HIDDEN item 2 key (23760 DIR_ITEM 2688971413) itemoff 16119 itemsize 51 location key (26923386 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 21 name: HIDDEN item 3 key (23760 DIR_ITEM 2764880658) itemoff 16064 itemsize 55 location key (26923399 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 25 name: HIDDEN item 4 key (23760 DIR_ITEM 2805527189) itemoff 16006 itemsize 58 location key (28124233 INODE_ITEM 0) type FILE transid 86739 data_len 0 name_len 28 name: HIDDEN item 5 key (23760 DIR_ITEM 2876464375) itemoff 15957 itemsize 49 location key (26923393 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 19 name: HIDDEN item 6 key (23760 DIR_ITEM 2951059296) itemoff 15907 itemsize 50 location key (28124218 INODE_ITEM 0) type FILE transid 86739 data_len 0 name_len 20 name: HIDDEN item 7 key (23760 DIR_ITEM 3058144963) itemoff 15859 itemsize 48 location key (26923384 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 18 name: HIDDEN item 8 key (23760 DIR_ITEM 3095440808) itemoff 15804 itemsize 55 location key (26923394 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 25 name: HIDDEN item 9 key (23760 DIR_ITEM 3124573416) itemoff 15748 itemsize 56 location key (26923387 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 26 
name: HIDDEN item 10 key (23760 DIR_ITEM 3194204932) itemoff 15690 itemsize 58 location key (26923397 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 28 name: HIDDEN item 11 key (23760 DIR_ITEM 3281114395) itemoff 15637 itemsize 53 location key (26923388 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 23 name: HIDDEN item 12 key (23760 DIR_ITEM 3353597736) itemoff 15588 itemsize 49 location key (24944 INODE_ITEM 0) type FILE transid 10694 data_len 0 name_len 19 name: HIDDEN item 13 key (23760 DIR_ITEM 3389003195) itemoff 15539 itemsize 49 location key (28124226 INODE_ITEM 0) type FILE transid 86739 data_len 0 name_len 19 name: HIDDEN item 14 key (23760 DIR_ITEM 3461310858) itemoff 15473 itemsize 66 location key (26923392 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 36 name: HIDDEN item 15 key (23760 DIR_ITEM 3660173809) itemoff 15422 itemsize 51 location key (28124225 INODE_ITEM 0) type FILE transid 86739 data_len 0 name_len 21 name: HIDDEN item 16 key (23760 DIR_ITEM 3678308711) itemoff 15371 itemsize 51 location key (28124220 INODE_ITEM 0) type FILE transid 86739 data_len 0 name_len 21 name: HIDDEN item 17 key (23760 DIR_ITEM 3708519009) itemoff 15316 itemsize 55 location key (28124224 INODE_ITEM 0) type FILE transid 86739 data_len 0 name_len 25 name: HIDDEN item 18 key (23760 DIR_ITEM 3716314603) itemoff 15258 itemsize 58 location key (26923396 INODE_ITEM 0) type FILE transid 74009 data_len 0 name_len 28 name: HIDDEN item 19 key (23760 DIR_ITEM 3958443109) itemoff 15224 itemsize 34 location key (24016 INODE_ITEM 0) type DIR transid 10693 data_len 0 name_len 4 name: HIDDEN item 20 key (23760 DIR_INDEX 2) itemoff 15190 itemsize 34 location key (24016 INODE_ITEM 0) type DIR transid 10693 data_len 0 name_len 4 name: HIDDEN item 21 key (23760 DIR_INDEX
Re: runtime btrfsck
Hi,

On 10.05.2017 at 09:48, Martin Steigerwald wrote:
> Stefan Priebe - Profihost AG - 10.05.17, 09:02:
>> I'm now trying btrfs progs 4.10.2. Is anybody out there who can tell me
>> something about the expected runtime or how to fix bad key ordering?
>
> I had a similar issue which remained unresolved.
> But I clearly saw that btrfs check was running in a loop, see thread:
> [4.9] btrfs check --repair looping over file extent discount errors
>
> So it would be interesting to see the exact output of btrfs check; maybe
> there is something like repeated numbers that also indicates a loop.

The output is just:

enabling repair mode
Checking filesystem on /dev/mapper/crypt_md0
UUID: 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
bad key ordering 39 40
checking extents [.]

even after running for 2.5 weeks.

Stefan

> I was about to say that BTRFS is production ready before this issue happened.
> I still think that for a lot of setups it mostly is, as at least the "I get
> stuck on the CPU while searching for free space" issue seems to be gone since
> somewhere around the 4.5/4.6 kernels. I also think so regarding the absence
> of data loss. I was able to copy over all of the data I needed from the
> broken filesystem.
>
> Yet, when it comes to btrfs check? It's still quite rudimentary if you ask me.
>
> So unless someone has a clever idea here and shares it with you, it may be
> necessary to back up anything you can from this filesystem and then start
> over from scratch. In my past experience, something like xfs_repair surpasses
> btrfs check in the ability to actually fix a broken filesystem by a great
> extent.
>
> Ciao,
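The two numbers in "bad key ordering 39 40" appear to be adjacent slot numbers in a leaf whose keys are not in ascending order. A minimal sketch of that check, using the keys for slots 37 to 41 of leaf 163316514816 from the btrfs-debug-tree dump posted elsewhere in this thread, could look like the following. The function name `find_bad_order` is made up for this example, and the comparison simply treats each key as an (objectid, type, offset) tuple, which is an assumption about how the real check orders keys.

```python
DIR_INDEX = 96  # btrfs key type code for DIR_INDEX items

# (slot, (objectid, type, offset)) taken from the posted dump
items = [
    (37, (23760, DIR_INDEX, 36)),
    (38, (23760, DIR_INDEX, 37)),
    (39, (23760, DIR_INDEX, 38)),
    (40, (23760, DIR_INDEX, 22)),
    (41, (23760, DIR_INDEX, 23)),
]


def find_bad_order(items):
    """Return (slot, next_slot) pairs where the key does not strictly increase."""
    bad = []
    for (slot_a, key_a), (slot_b, key_b) in zip(items, items[1:]):
        if key_a >= key_b:          # equal keys (duplicates) are also invalid
            bad.append((slot_a, slot_b))
    return bad


print(find_bad_order(items))  # [(39, 40)]
```

The offset drops from 38 to 22 between slots 39 and 40, which lines up with both the btrfs check message and the "slot=39" reported by the kernel for this block.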
Re: runtime btrfsck
On 10.05.2017 at 09:40, Hugo Mills wrote:
> On Wed, May 10, 2017 at 09:36:30AM +0200, Stefan Priebe - Profihost AG wrote:
>> Hello Roman,
>>
>> the FS is mountable. It just goes readonly when trying to write some data.
>>
>> The kernel msgs are:
>> BTRFS critical (device dm-2): corrupt leaf, bad key order:
>> block=163316514816,root=1, slot=39
>> BTRFS critical (device dm-2): corrupt leaf, bad key order:
>> block=163322413056,root=1, slot=74
>> BTRFS critical (device dm-2): corrupt leaf, bad key order:
>> block=163325722624,root=1, slot=63
>> BTRFS critical (device dm-2): corrupt leaf, bad key order:
>> block=163316514816,root=1, slot=39
>> BTRFS: error (device dm-2) in btrfs_drop_snapshot:8839: errno=-5 IO failure
>> BTRFS info (device dm-2): forced readonly
>> BTRFS info (device dm-2): delayed_refs has NO entry
>
> Can you show the output of btrfs-debug-tree -b , where
> is each of the three "block=" values above?

I can do that, but the lists are very long. Should I upload them to pastebin?
Is it OK to remove the name attribute, which contains the filenames?

Stefan

> Hugo.
>
>> Greets,
>> Stefan
>> On 10.05.2017 at 09:18, Roman Mamedov wrote:
>>> On Wed, 10 May 2017 09:02:46 +0200
>>> Stefan Priebe - Profihost AG <s.pri...@profihost.ag> wrote:
>>>
>>>> how to fix bad key ordering?
>>>
>>> You should clarify does the FS in question mount (read-write? read-only?)
>>> and what are the kernel messages if it does not.
Re: runtime btrfsck
Hello Roman,

the FS is mountable. It just goes read-only when trying to write some data.

The kernel messages are:
BTRFS critical (device dm-2): corrupt leaf, bad key order: block=163316514816,root=1, slot=39
BTRFS critical (device dm-2): corrupt leaf, bad key order: block=163322413056,root=1, slot=74
BTRFS critical (device dm-2): corrupt leaf, bad key order: block=163325722624,root=1, slot=63
BTRFS critical (device dm-2): corrupt leaf, bad key order: block=163316514816,root=1, slot=39
BTRFS: error (device dm-2) in btrfs_drop_snapshot:8839: errno=-5 IO failure
BTRFS info (device dm-2): forced readonly
BTRFS info (device dm-2): delayed_refs has NO entry

Greets,
Stefan

On 10.05.2017 at 09:18, Roman Mamedov wrote:
> On Wed, 10 May 2017 09:02:46 +0200
> Stefan Priebe - Profihost AG <s.pri...@profihost.ag> wrote:
>
>> how to fix bad key ordering?
>
> You should clarify does the FS in question mount (read-write? read-only?)
> and what are the kernel messages if it does not.
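Since the same leaf can be reported more than once, it may help to reduce the kernel messages above to the distinct block numbers before inspecting each leaf. The following is only a sketch: the regular expression is written against the message format shown in this mail and may not match other kernel versions, and the function name `corrupt_leaves` is made up for this example.

```python
import re

LOG = """\
BTRFS critical (device dm-2): corrupt leaf, bad key order: block=163316514816,root=1, slot=39
BTRFS critical (device dm-2): corrupt leaf, bad key order: block=163322413056,root=1, slot=74
BTRFS critical (device dm-2): corrupt leaf, bad key order: block=163325722624,root=1, slot=63
BTRFS critical (device dm-2): corrupt leaf, bad key order: block=163316514816,root=1, slot=39
BTRFS: error (device dm-2) in btrfs_drop_snapshot:8839: errno=-5 IO failure
"""

# Matches only the "bad key order" lines in the format quoted above
PATTERN = re.compile(r"bad key order: block=(\d+),\s*root=(\d+),\s*slot=(\d+)")


def corrupt_leaves(log):
    """Map each distinct block number to the first slot reported for it."""
    leaves = {}
    for match in PATTERN.finditer(log):
        block, _root, slot = (int(g) for g in match.groups())
        leaves.setdefault(block, slot)   # keep first report, drop repeats
    return leaves


for block, slot in corrupt_leaves(LOG).items():
    print(block, slot)
```

This yields three distinct leaves (163316514816, 163322413056 and 163325722624), which are the block numbers to look at individually.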
Re: runtime btrfsck
I'm now trying btrfs-progs 4.10.2. Is there anybody out there who can tell me
something about the expected runtime or how to fix bad key ordering?

Greets,
Stefan

On 06.05.2017 at 07:56, Stefan Priebe - Profihost AG wrote:
> It's still running. Is this the normal behaviour? Is there any other way
> to fix the bad key ordering?
>
> Greets,
> Stefan
>
> On 02.05.2017 at 08:29, Stefan Priebe - Profihost AG wrote:
>> Hello list,
>>
>> i wanted to check an fs cause it has bad key ordering.
>>
>> But btrfscheck is now running since 7 days. Current output:
>> # btrfsck -p --repair /dev/mapper/crypt_md0
>> enabling repair mode
>> Checking filesystem on /dev/mapper/crypt_md0
>> UUID: 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
>> bad key ordering 39 40
>> checking extents [O]
>>
>> FS is a 12TB BTRFS Raid 0 over 3 mdadm Raid 5 devices. How long should
>> btrfsck run and is there any way to speed it up? btrfs tools is 4.8.5
>>
>> Thanks!
>>
>> Greets,
>> Stefan
Re: runtime btrfsck
It's still running. Is this the normal behaviour? Is there any other way
to fix the bad key ordering?

Greets,
Stefan

On 02.05.2017 at 08:29, Stefan Priebe - Profihost AG wrote:
> Hello list,
>
> i wanted to check an fs cause it has bad key ordering.
>
> But btrfscheck is now running since 7 days. Current output:
> # btrfsck -p --repair /dev/mapper/crypt_md0
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_md0
> UUID: 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
> bad key ordering 39 40
> checking extents [O]
>
> FS is a 12TB BTRFS Raid 0 over 3 mdadm Raid 5 devices. How long should
> btrfsck run and is there any way to speed it up? btrfs tools is 4.8.5
>
> Thanks!
>
> Greets,
> Stefan
runtime btrfsck
Hello list,

I wanted to check a filesystem because it has bad key ordering.

But btrfsck has now been running for 7 days. Current output:

# btrfsck -p --repair /dev/mapper/crypt_md0
enabling repair mode
Checking filesystem on /dev/mapper/crypt_md0
UUID: 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
bad key ordering 39 40
checking extents [O]

The FS is a 12TB btrfs RAID0 over 3 mdadm RAID5 devices. How long should
btrfsck run, and is there any way to speed it up? The btrfs tools version is
4.8.5.

Thanks!

Greets,
Stefan
Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues
Hello Qu,

still no one on this one? Or has this been solved in another way in 4.10 or
4.11, or is compression just experimental? I haven't seen a note on this.

Thanks,
Stefan

On 27.02.2017 at 14:43, Stefan Priebe - Profihost AG wrote:
> Hi,
>
> can anybody please comment on that one? Josef? Chris? I still need those
> patches to be able to let btrfs run for more than 24 hours without ENOSPC
> issues.
>
> Greets,
> Stefan
>
> On 27.02.2017 at 08:22, Qu Wenruo wrote:
>>
>> At 02/25/2017 04:23 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear Qu,
>>>
>>> any news on your branch? I still don't see it merged anywhere.
>>>
>>> Greets,
>>> Stefan
>>
>> I just remembered that Liu Bo commented on one of the patches; I'm afraid
>> I can only push these patches once I have addressed his concern.
>>
>> I'll start digging into it, as my memory of this fix is quite blurred now.
>>
>> Thanks,
>> Qu
>>>
>>> On 04.01.2017 at 17:13, Stefan Priebe - Profihost AG wrote:
>>>> Hi Qu,
>>>>
>>>> On 01.01.2017 at 10:32, Qu Wenruo wrote:
>>>>> Hi Stefan,
>>>>>
>>>>> I'm trying to push it to for-next (will be v4.11), but no response yet.
>>>>>
>>>>> It would be quite nice for you to test the following git pull and give
>>>>> some feedback, so that we can merge it faster.
>>>>>
>>>>> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg60418.html
>>>>
>>>> I'm also getting a notification that Wang's email address does not exist
>>>> anymore (wangxg.f...@cn.fujitsu.com).
>>>>
>>>> I would like to test that branch but will need some time to do so. Last
>>>> time I tried 4.10-rc1 I had the same problems as this guy:
>>>> https://www.marc.info/?l=linux-btrfs&m=148338312525137&w=2
>>>>
>>>> Stefan
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>> On 12/31/2016 03:31 PM, Stefan Priebe - Profihost AG wrote:
>>>>>> Any news on this series? I can't see it in 4.9 nor in 4.10-rc
>>>>>>
>>>>>> Stefan
>>>>>>
>>>>>> On 11.11.2016 at 09:39, Wang Xiaoguang wrote:
>>>>>>> When having compression enabled, Stefan Priebe often got enospc errors
>>>>>>> though fs still has much free space. Qu Wenruo also has submitted a
>>>>>>> fstests test case which can reproduce this bug steadily, please see
>>>>>>> url: https://patchwork.kernel.org/patch/9420527
>>>>>>>
>>>>>>> First patch[1/3] "btrfs: improve inode's outstanding_extents
>>>>>>> computation" is to
>>>>>>> fix outstanding_extents and reserved_extents account issues. This
>>>>>>> issue was revealed
>>>>>>> by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, When modifying
>>>>>>> BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often gets these
>>>>>>> warnings from
>>>>>>> btrfs_destroy_inode():
>>>>>>> WARN_ON(BTRFS_I(inode)->outstanding_extents);
>>>>>>> WARN_ON(BTRFS_I(inode)->reserved_extents);
>>>>>>> Please see this patch's commit message for detailed info, and this
>>>>>>> patch is
>>>>>>> necessary to patch2 and patch3.
>>>>>>>
>>>>>>> For false enospc, the root reason is that for compression, its max
>>>>>>> extent size will
>>>>>>> be 128k, not 128MB. If we still use 128MB as max extent size to
>>>>>>> reserve metadata for
>>>>>>> compression, obviously it's not appropriate. In patch "btrfs:
>>>>>>> Introduce COMPRESS
>>>>>>> reserve type to fix false enospc for compression" commit message,
>>>>>>> we explain why false enospc error occurs, please see it for detailed
>>>>>>> info.
>>>>>>>
>>>>>>> To fix this issue, we introduce a new enum type:
>>>>>>> enum btrfs_metadata_reserve_type {
>>>>>>> BTRFS_RESERVE_NORMAL,
>>>>>>> BTRFS_RESERVE_COMPRESS,
>>>>>>> };
>>>>>>> For btrfs_delalloc_[reserve|release]_metadata() and
>>>>>>
Re: [PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error
Thanks Qu,

removing BTRFS_I from the inode fixes this issue for me.

Greets,
Stefan

On 14.03.2017 at 03:50, Qu Wenruo wrote:
>
> At 03/13/2017 09:26 PM, Stefan Priebe - Profihost AG wrote:
>>
>> On 13.03.2017 at 08:39, Qu Wenruo wrote:
>>>
>>> At 03/13/2017 03:26 PM, Stefan Priebe - Profihost AG wrote:
>>>> Hi Qu,
>>>>
>>>> On 13.03.2017 at 02:16, Qu Wenruo wrote:
>>>>
>>>> But wasn't this part of the code identical in V5? Why does it only
>>>> happen with V7?
>>>
>>> There are still differences, but just as you said, the related
>>> part (checking if the inode is a free space cache inode) is identical
>>> across v5 and v7.
>>
>> But if I boot v7 it always happens. If I boot v5 it always works. I
>> have done 5 repeated tests.
>
> I rechecked the code change between v7 and v5.
>
> It turns out that the code base may cause the problem.
>
> In v7, the base is v4.11-rc1, which introduced quite a lot of
> btrfs_inode cleanup.
>
> One of the differences is the parameter for btrfs_is_free_space_inode().
>
> In v7, the parameter @inode changed from struct inode to struct
> btrfs_inode.
>
> So in v7, we're passing BTRFS_I(inode) to btrfs_is_free_space_inode(),
> rather than the plain inode.
>
> That's the most likely cause for me here.
>
> So would you please paste the final patch applied to your tree?
> Git diff or git format-patch can both handle it.
>
> Thanks,
> Qu
>
>>
>>> I'm afraid that's a rare race leading to NULL btrfs_inode->root, which
>>> could happen in both v5 and v7.
>>>
>>> What's the difference between the SUSE and mainline kernels?
>>
>> A lot ;-) But I don't think anything related.
>>
>>> Maybe some mainline kernel commits have already fixed it?
>>
>> Maybe, no idea. But I haven't found any reason why v5 works.
>>
>> Stefan
>>
>>>
>>> Thanks,
>>> Qu
Re: [PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error
Am 13.03.2017 um 08:39 schrieb Qu Wenruo: > > > At 03/13/2017 03:26 PM, Stefan Priebe - Profihost AG wrote: >> Hi Qu, >> >> Am 13.03.2017 um 02:16 schrieb Qu Wenruo: >>> >>> At 03/13/2017 04:49 AM, Stefan Priebe - Profihost AG wrote: >>>> Hi Qu, >>>> >>>> while V5 was running fine against the openSUSE-42.2 kernel (based on >>>> v4.4). >>> >>> Thanks for the test. >>> >>>> V7 results in OOPS to me: >>>> BUG: unable to handle kernel NULL pointer dereference at >>>> 01f0 >>> >>> This 0x1f0 is the same as offsetof(struct brrfs_root, fs_info), quite >>> nice clue. >>> >>>> IP: [] __endio_write_update_ordered+0x33/0x140 >>>> [btrfs] >>> >>> IP points to: >>> --- >>> static inline bool btrfs_is_free_space_inode(struct btrfs_inode *inode) >>> { >>> struct btrfs_root *root = inode->root; << Either here >>> >>> if (root == root->fs_info->tree_root && << Or here >>> btrfs_ino(inode) != BTRFS_BTREE_INODE_OBJECTID) >>> >>> --- >>> >>> Taking the above offset into consideration, it's only possible for later >>> case. >>> >>> So here, we have a btrfs_inode whose @root is NULL. >> >> But wasn't this part of the code identical in V5? Why does it only >> happen with V7? > > There are still difference, but just as you said, the related > part(checking if inode is free space cache inode) is identical across v5 > and v7. But if i boot v7 it always happens. If i boot v5 it always works. Have done 5 repeatet tests. > I'm afraid that's a rare race leading to NULL btrfs_inode->root, which > could happen in both v5 and v7. > > What's the difference between SUSE and mainline kernel? A lot ;-) But i don't think anything related. > Maybe some mainline kernel commits have already fixed it? May be no idea. But i haven't found any reason why v5 works. Stefan > > Thanks, > Qu >> >>> This can be fixed easily by checking @root inside >>> btrfs_is_free_space_inode(), as the backtrace shows that it's only >>> happening for DirectIO, and it won't happen for free space cache inode. 
>>> >>> But I'm more curious how this happened for a more accurate fix, or we >>> could have other NULL pointer access. >>> >>> Did you have any reproducer for this? >> >> Sorry no - this is a production MariaDB Server running btrfs with >> compress-force=zlib. But if i could test anything i'll do. >> >> Greets, >> Stefan >> >>> >>> Thanks, >>> Qu >>> >>>> PGD 14e18d4067 PUD 14e1868067 PMD 0 >>>> Oops: [#1] SMP >>>> Modules linked in: netconsole xt_multiport ipt_REJECT nf_reject_ipv4 >>>> xt_set iptable_filter ip_tables x_tables ip_set_hash_net ip_set >>>> nfnetlink crc32_pclmul button loop btrfs xor usbhid raid6_pq >>>> ata_generic >>>> virtio_blk virtio_net uhci_hcd ehci_hcd i2c_piix4 usbcore virtio_pci >>>> i2c_core usb_common ata_piix floppy >>>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.52+112-ph #1 >>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS >>>> 1.7.5-20140722_172050-sagunt 04/01/2014 >>>> task: b4e0f500 ti: b4e0 task.ti: b4e0 >>>> RIP: 0010:[] [] >>>> __endio_write_update_ordered+0x33/0x140 [btrfs] >>>> RSP: 0018:8814eae03cd8 EFLAGS: 00010086 >>>> RAX: RBX: 8814e8fd5aa8 RCX: 0001 >>>> RDX: 0010 RSI: 0010 RDI: 8814e45885c0 >>>> RBP: 8814eae03d10 R08: 8814e8334000 R09: 00018040003a >>>> R10: ea00507d8d00 R11: 88141f634080 R12: 8814e45885c0 >>>> R13: 8814e125d700 R14: 0010 R15: 8800376c6a80 >>>> FS: () GS:8814eae0() >>>> knlGS: >>>> CS: 0010 DS: ES: CR0: 80050033 >>>> CR2: 01f0 CR3: 0014e34c9000 CR4: 001406f0Stack: >>>> 0010 8814e8fd5aa8 8814e953f3c0 >>>> 8814e125d700 0010 8800376c6a80 8814eae03d38 >>>> c03ddf67 8814e86b6a80 8814e8fd5aa8 0001 >>>> Call Trace: >>>> [] btrfs_endio_direct_write+0x37/0x60 [btrfs] >>>> [] bio_
Re: [PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error
Hi Qu, Am 13.03.2017 um 02:16 schrieb Qu Wenruo: > > At 03/13/2017 04:49 AM, Stefan Priebe - Profihost AG wrote: >> Hi Qu, >> >> while V5 was running fine against the openSUSE-42.2 kernel (based on >> v4.4). > > Thanks for the test. > >> V7 results in OOPS to me: >> BUG: unable to handle kernel NULL pointer dereference at 01f0 > > This 0x1f0 is the same as offsetof(struct brrfs_root, fs_info), quite > nice clue. > >> IP: [] __endio_write_update_ordered+0x33/0x140 [btrfs] > > IP points to: > --- > static inline bool btrfs_is_free_space_inode(struct btrfs_inode *inode) > { > struct btrfs_root *root = inode->root; << Either here > > if (root == root->fs_info->tree_root && << Or here > btrfs_ino(inode) != BTRFS_BTREE_INODE_OBJECTID) > > --- > > Taking the above offset into consideration, it's only possible for later > case. > > So here, we have a btrfs_inode whose @root is NULL. But wasn't this part of the code identical in V5? Why does it only happen with V7? > This can be fixed easily by checking @root inside > btrfs_is_free_space_inode(), as the backtrace shows that it's only > happening for DirectIO, and it won't happen for free space cache inode. > > But I'm more curious how this happened for a more accurate fix, or we > could have other NULL pointer access. > > Did you have any reproducer for this? Sorry no - this is a production MariaDB Server running btrfs with compress-force=zlib. But if i could test anything i'll do. 
Greets, Stefan > > Thanks, > Qu > >> PGD 14e18d4067 PUD 14e1868067 PMD 0 >> Oops: [#1] SMP >> Modules linked in: netconsole xt_multiport ipt_REJECT nf_reject_ipv4 >> xt_set iptable_filter ip_tables x_tables ip_set_hash_net ip_set >> nfnetlink crc32_pclmul button loop btrfs xor usbhid raid6_pq ata_generic >> virtio_blk virtio_net uhci_hcd ehci_hcd i2c_piix4 usbcore virtio_pci >> i2c_core usb_common ata_piix floppy >> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.52+112-ph #1 >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS >> 1.7.5-20140722_172050-sagunt 04/01/2014 >> task: b4e0f500 ti: b4e0 task.ti: b4e0 >> RIP: 0010:[] [] >> __endio_write_update_ordered+0x33/0x140 [btrfs] >> RSP: 0018:8814eae03cd8 EFLAGS: 00010086 >> RAX: RBX: 8814e8fd5aa8 RCX: 0001 >> RDX: 0010 RSI: 0010 RDI: 8814e45885c0 >> RBP: 8814eae03d10 R08: 8814e8334000 R09: 00018040003a >> R10: ea00507d8d00 R11: 88141f634080 R12: 8814e45885c0 >> R13: 8814e125d700 R14: 0010 R15: 8800376c6a80 >> FS: () GS:8814eae0() >> knlGS: >> CS: 0010 DS: ES: CR0: 80050033 >> CR2: 01f0 CR3: 0014e34c9000 CR4: 001406f0Stack: >> 0010 8814e8fd5aa8 8814e953f3c0 >> 8814e125d700 0010 8800376c6a80 8814eae03d38 >> c03ddf67 8814e86b6a80 8814e8fd5aa8 0001 >> Call Trace: >> [] btrfs_endio_direct_write+0x37/0x60 [btrfs] >> [] bio_endio+0x57/0x60 >> [] btrfs_end_bio+0xa1/0x140 [btrfs] >> [] bio_endio+0x57/0x60 >> [] blk_update_request+0x8b/0x330 >> [] blk_mq_end_request+0x1a/0x70 >> [] virtblk_request_done+0x3f/0x70 [virtio_blk] >> [] __blk_mq_complete_request+0x78/0xe0 >> [] blk_mq_complete_request+0x1c/0x20 >> [] virtblk_done+0x64/0xe0 [virtio_blk] >> [] vring_interrupt+0x3a/0x90 >> [] __handle_irq_event_percpu+0x89/0x1b0 >> [] handle_irq_event_percpu+0x23/0x60 >> [] handle_irq_event+0x3b/0x60 >> [] handle_edge_irq+0x6f/0x150 >> [] handle_irq+0x1d/0x30 >> [] do_IRQ+0x4b/0xd0 >> [] common_interrupt+0x8c/0x8c >> DWARF2 unwinder stuck at ret_from_intr+0x0/0x1b >> Leftover inexact backtrace: >> 2017-03-12 20:33:08 >> 
2017-03-12 20:33:08 [] ? native_safe_halt+0x6/0x10 >> [] default_idle+0x1e/0xe0 >> [] arch_cpu_idle+0xf/0x20 >> [] default_idle_call+0x3b/0x40 >> [] cpu_startup_entry+0x29a/0x370 >> [] rest_init+0x7c/0x80 >> [] start_kernel+0x490/0x49d >> [] ? early_idt_handler_array+0x120/0x120 >> [] x86_64_start_reservations+0x2a/0x2c >> [] x86_64_start_kernel+0x13b/0x14a >> Code: e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 10 48 8b 87 70 fc >> ff ff 4c 8b 87 38 fe ff ff 48 c7 45 c8 00 00 00 00 48 89 75 d0 <48> 8b >> b8 f0 01 00 00 48 3b 47 28 49 8b 84 24 78 fc ff ff 0f 84 >> RIP [] __endio_write_upd
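
Editor's sketch of the reasoning quoted above: Qu notes that the faulting address 0x1f0 equals offsetof(struct btrfs_root, fs_info), so the read of root->fs_info must have started from a NULL root. The snippet below only reproduces that address arithmetic. FakeBtrfsRoot is a hypothetical stand-in layout (opaque padding followed by one pointer), not the real kernel structure, and 0x1f0 is simply the value stated in the thread.

```python
import ctypes

# Offset quoted in the thread for offsetof(struct btrfs_root, fs_info).
FS_INFO_OFFSET = 0x1F0


class FakeBtrfsRoot(ctypes.Structure):
    """Hypothetical stand-in: opaque leading bytes, then the fs_info pointer."""
    _fields_ = [
        ("leading_fields", ctypes.c_char * FS_INFO_OFFSET),
        ("fs_info", ctypes.c_void_p),
    ]


def faulting_address(root_pointer):
    """Address that is read when evaluating root->fs_info for this root pointer."""
    return root_pointer + FakeBtrfsRoot.fs_info.offset


# With root == NULL (0), the access lands exactly on the offset of fs_info,
# which is the address reported by the oops.
print(hex(FakeBtrfsRoot.fs_info.offset))
print(hex(faulting_address(0)))
```

A fault at a small non-zero address like this is the usual sign of a member access through a NULL structure pointer, which is why the offset alone was enough to single out the second marked line in btrfs_is_free_space_inode().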
[PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error
Hi Qu,

while V5 was running fine against the openSUSE-42.2 kernel (based on v4.4), V7 results in an OOPS for me:

BUG: unable to handle kernel NULL pointer dereference at 01f0
IP: [] __endio_write_update_ordered+0x33/0x140 [btrfs]
PGD 14e18d4067 PUD 14e1868067 PMD 0
Oops: [#1] SMP
Modules linked in: netconsole xt_multiport ipt_REJECT nf_reject_ipv4 xt_set iptable_filter ip_tables x_tables ip_set_hash_net ip_set nfnetlink crc32_pclmul button loop btrfs xor usbhid raid6_pq ata_generic virtio_blk virtio_net uhci_hcd ehci_hcd i2c_piix4 usbcore virtio_pci i2c_core usb_common ata_piix floppy
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.52+112-ph #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140722_172050-sagunt 04/01/2014
task: b4e0f500 ti: b4e0 task.ti: b4e0
RIP: 0010:[] [] __endio_write_update_ordered+0x33/0x140 [btrfs]
RSP: 0018:8814eae03cd8 EFLAGS: 00010086
RAX: RBX: 8814e8fd5aa8 RCX: 0001
RDX: 0010 RSI: 0010 RDI: 8814e45885c0
RBP: 8814eae03d10 R08: 8814e8334000 R09: 00018040003a
R10: ea00507d8d00 R11: 88141f634080 R12: 8814e45885c0
R13: 8814e125d700 R14: 0010 R15: 8800376c6a80
FS: () GS:8814eae0() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 01f0 CR3: 0014e34c9000 CR4: 001406f0
Stack:
0010 8814e8fd5aa8 8814e953f3c0
8814e125d700 0010 8800376c6a80 8814eae03d38
c03ddf67 8814e86b6a80 8814e8fd5aa8 0001
Call Trace:
[] btrfs_endio_direct_write+0x37/0x60 [btrfs]
[] bio_endio+0x57/0x60
[] btrfs_end_bio+0xa1/0x140 [btrfs]
[] bio_endio+0x57/0x60
[] blk_update_request+0x8b/0x330
[] blk_mq_end_request+0x1a/0x70
[] virtblk_request_done+0x3f/0x70 [virtio_blk]
[] __blk_mq_complete_request+0x78/0xe0
[] blk_mq_complete_request+0x1c/0x20
[] virtblk_done+0x64/0xe0 [virtio_blk]
[] vring_interrupt+0x3a/0x90
[] __handle_irq_event_percpu+0x89/0x1b0
[] handle_irq_event_percpu+0x23/0x60
[] handle_irq_event+0x3b/0x60
[] handle_edge_irq+0x6f/0x150
[] handle_irq+0x1d/0x30
[] do_IRQ+0x4b/0xd0
[] common_interrupt+0x8c/0x8c
DWARF2 unwinder stuck at ret_from_intr+0x0/0x1b
Leftover inexact backtrace:
2017-03-12 20:33:08
2017-03-12 20:33:08 [] ? native_safe_halt+0x6/0x10
[] default_idle+0x1e/0xe0
[] arch_cpu_idle+0xf/0x20
[] default_idle_call+0x3b/0x40
[] cpu_startup_entry+0x29a/0x370
[] rest_init+0x7c/0x80
[] start_kernel+0x490/0x49d
[] ? early_idt_handler_array+0x120/0x120
[] x86_64_start_reservations+0x2a/0x2c
[] x86_64_start_kernel+0x13b/0x14a
Code: e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 10 48 8b 87 70 fc ff ff 4c 8b 87 38 fe ff ff 48 c7 45 c8 00 00 00 00 48 89 75 d0 <48> 8b b8 f0 01 00 00 48 3b 47 28 49 8b 84 24 78 fc ff ff 0f 84
RIP [] __endio_write_update_ordered+0x33/0x140 [btrfs]
RSP
CR2: 01f0
---[ end trace 7529a0652fd7873e ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x3300 from 0x8100 (relocation range: 0x8000-0xbfff)

Greets,
Stefan
Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues
Hi, can please anybody comment on that one? Josef? Chris? I still need those patches to be able to let btrfs run for more than 24hours without ENOSPC issues. Greets, Stefan Am 27.02.2017 um 08:22 schrieb Qu Wenruo: > > > At 02/25/2017 04:23 PM, Stefan Priebe - Profihost AG wrote: >> Dear Qu, >> >> any news on your branch? I still don't see it merged anywhere. >> >> Greets, >> Stefan > > I just remember that Liu Bo has commented one of the patches, I'm afraid > I can only push these patches until I addressed his concern. > > I'll start digging it as memory for this fix is quite blurred now. > > Thanks, > Qu >> >> Am 04.01.2017 um 17:13 schrieb Stefan Priebe - Profihost AG: >>> Hi Qu, >>> >>> Am 01.01.2017 um 10:32 schrieb Qu Wenruo: >>>> Hi Stefan, >>>> >>>> I'm trying to push it to for-next (will be v4.11), but no response yet. >>>> >>>> It would be quite nice for you to test the following git pull and give >>>> some feedback, so that we can merge it faster. >>>> >>>> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg60418.html >>> >>> I'm also getting a notifier that wang's email does not exist anymore >>> (wangxg.f...@cn.fujitsu.com). >>> >>> I would like to test that branch will need some time todo so. Last time >>> i tried 4.10-rc1 i had the same problems like this guy: >>> https://www.marc.info/?l=linux-btrfs=148338312525137=2 >>> >>> Stefan >>> >>>> Thanks, >>>> Qu >>>> >>>> On 12/31/2016 03:31 PM, Stefan Priebe - Profihost AG wrote: >>>>> Any news on this series? I can't see it in 4.9 nor in 4.10-rc >>>>> >>>>> Stefan >>>>> >>>>> Am 11.11.2016 um 09:39 schrieb Wang Xiaoguang: >>>>>> When having compression enabled, Stefan Priebe ofen got enospc errors >>>>>> though fs still has much free space. 
Qu Wenruo also has submitted a >>>>>> fstests test case which can reproduce this bug steadily, please see >>>>>> url: https://patchwork.kernel.org/patch/9420527 >>>>>> >>>>>> First patch[1/3] "btrfs: improve inode's outstanding_extents >>>>>> computation" is to >>>>>> fix outstanding_extents and reserved_extents account issues. This >>>>>> issue was revealed >>>>>> by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, When modifying >>>>>> BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often gets these >>>>>> warnings from >>>>>> btrfs_destroy_inode(): >>>>>> WARN_ON(BTRFS_I(inode)->outstanding_extents); >>>>>> WARN_ON(BTRFS_I(inode)->reserved_extents); >>>>>> Please see this patch's commit message for detailed info, and this >>>>>> patch is >>>>>> necessary to patch2 and patch3. >>>>>> >>>>>> For false enospc, the root reasson is that for compression, its max >>>>>> extent size will >>>>>> be 128k, not 128MB. If we still use 128MB as max extent size to >>>>>> reserve metadata for >>>>>> compression, obviously it's not appropriate. In patch "btrfs: >>>>>> Introduce COMPRESS >>>>>> reserve type to fix false enospc for compression" commit message, >>>>>> we explain why false enospc error occurs, please see it for detailed >>>>>> info. >>>>>> >>>>>> To fix this issue, we introduce a new enum type: >>>>>> enum btrfs_metadata_reserve_type { >>>>>> BTRFS_RESERVE_NORMAL, >>>>>> BTRFS_RESERVE_COMPRESS, >>>>>> }; >>>>>> For btrfs_delalloc_[reserve|release]_metadata() and >>>>>> btrfs_delalloc_[reserve|release]_space(), we introce a new >>>>>> btrfs_metadata_reserve_type >>>>>> argument, then if a path needs to go compression, we pass >>>>>> BTRFS_RESERVE_COMPRESS, >>>>>> otherwise pass BTRFS_RESERVE_NORMAL. >>>>>> >>>>>> With these patchs, Stefan no longer saw such false enospc errors, and >>>>>> Qu Wenruo's >>>>>> fstests test case will also pass. I have also run whole fstests >>>>>> multiple times, >>>>>>
Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues
Dear Qu, any news on your branch? I still don't see it merged anywhere. Greets, Stefan Am 04.01.2017 um 17:13 schrieb Stefan Priebe - Profihost AG: > Hi Qu, > > Am 01.01.2017 um 10:32 schrieb Qu Wenruo: >> Hi Stefan, >> >> I'm trying to push it to for-next (will be v4.11), but no response yet. >> >> It would be quite nice for you to test the following git pull and give >> some feedback, so that we can merge it faster. >> >> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg60418.html > > I'm also getting a notifier that wang's email does not exist anymore > (wangxg.f...@cn.fujitsu.com). > > I would like to test that branch will need some time todo so. Last time > i tried 4.10-rc1 i had the same problems like this guy: > https://www.marc.info/?l=linux-btrfs=148338312525137=2 > > Stefan > >> Thanks, >> Qu >> >> On 12/31/2016 03:31 PM, Stefan Priebe - Profihost AG wrote: >>> Any news on this series? I can't see it in 4.9 nor in 4.10-rc >>> >>> Stefan >>> >>> Am 11.11.2016 um 09:39 schrieb Wang Xiaoguang: >>>> When having compression enabled, Stefan Priebe ofen got enospc errors >>>> though fs still has much free space. Qu Wenruo also has submitted a >>>> fstests test case which can reproduce this bug steadily, please see >>>> url: https://patchwork.kernel.org/patch/9420527 >>>> >>>> First patch[1/3] "btrfs: improve inode's outstanding_extents >>>> computation" is to >>>> fix outstanding_extents and reserved_extents account issues. This >>>> issue was revealed >>>> by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, When modifying >>>> BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often gets these >>>> warnings from >>>> btrfs_destroy_inode(): >>>> WARN_ON(BTRFS_I(inode)->outstanding_extents); >>>> WARN_ON(BTRFS_I(inode)->reserved_extents); >>>> Please see this patch's commit message for detailed info, and this >>>> patch is >>>> necessary to patch2 and patch3. 
>>>> >>>> For false enospc, the root reasson is that for compression, its max >>>> extent size will >>>> be 128k, not 128MB. If we still use 128MB as max extent size to >>>> reserve metadata for >>>> compression, obviously it's not appropriate. In patch "btrfs: >>>> Introduce COMPRESS >>>> reserve type to fix false enospc for compression" commit message, >>>> we explain why false enospc error occurs, please see it for detailed >>>> info. >>>> >>>> To fix this issue, we introduce a new enum type: >>>> enum btrfs_metadata_reserve_type { >>>> BTRFS_RESERVE_NORMAL, >>>> BTRFS_RESERVE_COMPRESS, >>>> }; >>>> For btrfs_delalloc_[reserve|release]_metadata() and >>>> btrfs_delalloc_[reserve|release]_space(), we introce a new >>>> btrfs_metadata_reserve_type >>>> argument, then if a path needs to go compression, we pass >>>> BTRFS_RESERVE_COMPRESS, >>>> otherwise pass BTRFS_RESERVE_NORMAL. >>>> >>>> With these patchs, Stefan no longer saw such false enospc errors, and >>>> Qu Wenruo's >>>> fstests test case will also pass. I have also run whole fstests >>>> multiple times, >>>> no regression occurs, thanks. 
>>>> >>>> Wang Xiaoguang (3): >>>> btrfs: improve inode's outstanding_extents computation >>>> btrfs: introduce type based delalloc metadata reserve >>>> btrfs: Introduce COMPRESS reserve type to fix false enospc for >>>> compression >>>> >>>> fs/btrfs/ctree.h | 36 +-- >>>> fs/btrfs/extent-tree.c | 52 ++--- >>>> fs/btrfs/extent_io.c | 61 ++- >>>> fs/btrfs/extent_io.h | 5 + >>>> fs/btrfs/file.c | 25 +++-- >>>> fs/btrfs/free-space-cache.c | 6 +- >>>> fs/btrfs/inode-map.c | 6 +- >>>> fs/btrfs/inode.c | 246 >>>> ++- >>>> fs/btrfs/ioctl.c | 16 +-- >>>> fs/btrfs/relocation.c| 14 ++- >>>> fs/btrfs/tests/inode-tests.c | 15 +-- >>>> 11 files changed, 381 insertions(+), 101 deletions(-) >>>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majord...@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
high cpu usage due to btrfs_find_space_for_alloc and rb_next
Hi,

is there any chance to optimize btrfs_find_space_for_alloc / rb_next on big devices? I have plenty of free space, but most of the time there is only low I/O and high CPU usage.

perf top shows:

60,41%  [kernel]  [k] rb_next
 9,74%  [kernel]  [k] btrfs_find_space_for_alloc
 5,55%  [kernel]  [k] tree_search_offset.isra.25

# btrfs filesystem df /backup/
Data, single: total=14.85TiB, used=14.37TiB
System, single: total=32.00MiB, used=2.27MiB
Metadata, single: total=63.00GiB, used=54.87GiB
GlobalReserve, single: total=512.00MiB, used=80.17MiB

Stefan
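
Editor's sketch: to put a number on "plenty of free space", the `btrfs filesystem df` figures above can be turned into the space that is allocated to chunks but not used, per chunk type. This is only arithmetic on the quoted output. The regular expression and the helper name parse_df are made up for this example, and the result does not explain why rb_next is hot.

```python
import re

UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

# Matches lines such as "Data, single: total=14.85TiB, used=14.37TiB".
LINE_RE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<tunit>[KMGT]iB), "
    r"used=(?P<used>[\d.]+)(?P<uunit>[KMGT]iB)$"
)


def parse_df(text):
    """Return {chunk type: (total_bytes, used_bytes)} for each recognised line."""
    result = {}
    for line in text.strip().splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a df summary line
        total = float(m["total"]) * UNITS[m["tunit"]]
        used = float(m["used"]) * UNITS[m["uunit"]]
        result[m["kind"]] = (total, used)
    return result


# Output quoted in the message above.
DF_OUTPUT = """
Data, single: total=14.85TiB, used=14.37TiB
System, single: total=32.00MiB, used=2.27MiB
Metadata, single: total=63.00GiB, used=54.87GiB
GlobalReserve, single: total=512.00MiB, used=80.17MiB
"""

usage = parse_df(DF_OUTPUT)
for kind, (total, used) in usage.items():
    slack_gib = (total - used) / UNITS["GiB"]
    print(f"{kind:14s} unused inside allocated chunks: {slack_gib:10.2f} GiB")
```

For the data chunks this works out to roughly 0.48 TiB of unused space inside already allocated chunks.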
Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues
Hi Qu, Am 01.01.2017 um 10:32 schrieb Qu Wenruo: > Hi Stefan, > > I'm trying to push it to for-next (will be v4.11), but no response yet. > > It would be quite nice for you to test the following git pull and give > some feedback, so that we can merge it faster. > > https://mail-archive.com/linux-btrfs@vger.kernel.org/msg60418.html I'm also getting a notifier that wang's email does not exist anymore (wangxg.f...@cn.fujitsu.com). I would like to test that branch will need some time todo so. Last time i tried 4.10-rc1 i had the same problems like this guy: https://www.marc.info/?l=linux-btrfs=148338312525137=2 Stefan > Thanks, > Qu > > On 12/31/2016 03:31 PM, Stefan Priebe - Profihost AG wrote: >> Any news on this series? I can't see it in 4.9 nor in 4.10-rc >> >> Stefan >> >> Am 11.11.2016 um 09:39 schrieb Wang Xiaoguang: >>> When having compression enabled, Stefan Priebe ofen got enospc errors >>> though fs still has much free space. Qu Wenruo also has submitted a >>> fstests test case which can reproduce this bug steadily, please see >>> url: https://patchwork.kernel.org/patch/9420527 >>> >>> First patch[1/3] "btrfs: improve inode's outstanding_extents >>> computation" is to >>> fix outstanding_extents and reserved_extents account issues. This >>> issue was revealed >>> by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, When modifying >>> BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often gets these >>> warnings from >>> btrfs_destroy_inode(): >>> WARN_ON(BTRFS_I(inode)->outstanding_extents); >>> WARN_ON(BTRFS_I(inode)->reserved_extents); >>> Please see this patch's commit message for detailed info, and this >>> patch is >>> necessary to patch2 and patch3. >>> >>> For false enospc, the root reasson is that for compression, its max >>> extent size will >>> be 128k, not 128MB. If we still use 128MB as max extent size to >>> reserve metadata for >>> compression, obviously it's not appropriate. 
In patch "btrfs: >>> Introduce COMPRESS >>> reserve type to fix false enospc for compression" commit message, >>> we explain why false enospc error occurs, please see it for detailed >>> info. >>> >>> To fix this issue, we introduce a new enum type: >>> enum btrfs_metadata_reserve_type { >>> BTRFS_RESERVE_NORMAL, >>> BTRFS_RESERVE_COMPRESS, >>> }; >>> For btrfs_delalloc_[reserve|release]_metadata() and >>> btrfs_delalloc_[reserve|release]_space(), we introce a new >>> btrfs_metadata_reserve_type >>> argument, then if a path needs to go compression, we pass >>> BTRFS_RESERVE_COMPRESS, >>> otherwise pass BTRFS_RESERVE_NORMAL. >>> >>> With these patchs, Stefan no longer saw such false enospc errors, and >>> Qu Wenruo's >>> fstests test case will also pass. I have also run whole fstests >>> multiple times, >>> no regression occurs, thanks. >>> >>> Wang Xiaoguang (3): >>> btrfs: improve inode's outstanding_extents computation >>> btrfs: introduce type based delalloc metadata reserve >>> btrfs: Introduce COMPRESS reserve type to fix false enospc for >>> compression >>> >>> fs/btrfs/ctree.h | 36 +-- >>> fs/btrfs/extent-tree.c | 52 ++--- >>> fs/btrfs/extent_io.c | 61 ++- >>> fs/btrfs/extent_io.h | 5 + >>> fs/btrfs/file.c | 25 +++-- >>> fs/btrfs/free-space-cache.c | 6 +- >>> fs/btrfs/inode-map.c | 6 +- >>> fs/btrfs/inode.c | 246 >>> ++- >>> fs/btrfs/ioctl.c | 16 +-- >>> fs/btrfs/relocation.c| 14 ++- >>> fs/btrfs/tests/inode-tests.c | 15 +-- >>> 11 files changed, 381 insertions(+), 101 deletions(-) >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues
Any news on this series? I can't see it in 4.9 nor in 4.10-rc.

Stefan

On 11.11.2016 at 09:39, Wang Xiaoguang wrote:
> When having compression enabled, Stefan Priebe often got ENOSPC errors
> though the fs still has much free space. Qu Wenruo has also submitted a
> fstests test case which can reproduce this bug steadily, please see
> url: https://patchwork.kernel.org/patch/9420527
>
> The first patch [1/3] "btrfs: improve inode's outstanding_extents
> computation" fixes outstanding_extents and reserved_extents accounting
> issues. This issue was revealed by modifying
> BTRFS_MAX_EXTENT_SIZE (128MB) to 64KB. With that change, the fsstress
> test often gets these warnings from btrfs_destroy_inode():
>     WARN_ON(BTRFS_I(inode)->outstanding_extents);
>     WARN_ON(BTRFS_I(inode)->reserved_extents);
> Please see this patch's commit message for detailed info; this patch
> is necessary for patch 2 and patch 3.
>
> For false ENOSPC, the root reason is that for compression, the max
> extent size will be 128k, not 128MB. If we still use 128MB as the max
> extent size to reserve metadata for compression, obviously it's not
> appropriate. In the commit message of patch "btrfs: Introduce COMPRESS
> reserve type to fix false enospc for compression", we explain why the
> false ENOSPC error occurs, please see it for detailed info.
>
> To fix this issue, we introduce a new enum type:
>     enum btrfs_metadata_reserve_type {
>         BTRFS_RESERVE_NORMAL,
>         BTRFS_RESERVE_COMPRESS,
>     };
> For btrfs_delalloc_[reserve|release]_metadata() and
> btrfs_delalloc_[reserve|release]_space(), we introduce a new
> btrfs_metadata_reserve_type argument; if a path needs to go through
> compression, we pass BTRFS_RESERVE_COMPRESS, otherwise we pass
> BTRFS_RESERVE_NORMAL.
>
> With these patches, Stefan no longer saw such false ENOSPC errors, and
> Qu Wenruo's fstests test case will also pass. I have also run the whole
> fstests suite multiple times, no regression occurs, thanks.
>
> Wang Xiaoguang (3):
>   btrfs: improve inode's outstanding_extents computation
>   btrfs: introduce type based delalloc metadata reserve
>   btrfs: Introduce COMPRESS reserve type to fix false enospc for
>     compression
>
>  fs/btrfs/ctree.h             |  36
>  fs/btrfs/extent-tree.c       |  52
>  fs/btrfs/extent_io.c         |  61
>  fs/btrfs/extent_io.h         |   5
>  fs/btrfs/file.c              |  25
>  fs/btrfs/free-space-cache.c  |   6
>  fs/btrfs/inode-map.c         |   6
>  fs/btrfs/inode.c             | 246
>  fs/btrfs/ioctl.c             |  16
>  fs/btrfs/relocation.c        |  14
>  fs/btrfs/tests/inode-tests.c |  15
>  11 files changed, 381 insertions(+), 101 deletions(-)
>
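
Editor's sketch: the cover letter's point about the maximum extent size can be illustrated by counting how many extents a write is assumed to need under each limit. This is only the size mismatch expressed as arithmetic, assuming the 128MB and 128k limits named in the cover letter. ReserveType and extents_for are hypothetical names for this example and do not reproduce the kernel's actual reservation accounting.

```python
from enum import Enum

KIB = 1024
MIB = 1024 * KIB


class ReserveType(Enum):
    """Mirrors the two cases of the proposed enum, for illustration only."""
    NORMAL = "normal"
    COMPRESS = "compress"


# Maximum extent sizes as stated in the cover letter.
MAX_EXTENT_SIZE = {
    ReserveType.NORMAL: 128 * MIB,
    ReserveType.COMPRESS: 128 * KIB,
}


def extents_for(length, reserve_type):
    """Number of extents needed for `length` bytes, rounding up."""
    max_size = MAX_EXTENT_SIZE[reserve_type]
    return (length + max_size - 1) // max_size


write_size = 128 * MIB
print("assumed with the 128MB limit:", extents_for(write_size, ReserveType.NORMAL))
print("needed with the 128k limit:  ", extents_for(write_size, ReserveType.COMPRESS))
```

For a 128MB write the two limits differ by a factor of 1024 in the number of extents, which is the gap the type argument is meant to make visible to the reservation code.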
Re: Metadata balance fails ENOSPC
Isn't there a way to move free space to unallocated space again?

On 03.12.2016 at 05:43, Andrei Borzenkov wrote:
> 01.12.2016 18:48, Chris Murphy wrote:
>> On Thu, Dec 1, 2016 at 7:10 AM, Stefan Priebe - Profihost AG
>> <s.pri...@profihost.ag> wrote:
>>>
>>> On 01.12.2016 at 14:51, Hans van Kranenburg wrote:
>>>> On 12/01/2016 09:12 AM, Andrei Borzenkov wrote:
>>>>> On Thu, Dec 1, 2016 at 10:49 AM, Stefan Priebe - Profihost AG
>>>>> <s.pri...@profihost.ag> wrote:
>>>>> ...
>>>>>>
>>>>>> Custom 4.4 kernel with patches up to 4.10. But I already tried 4.9-rc7,
>>>>>> which does the same.
>>>>>>
>>>>>>>> # btrfs filesystem show /ssddisk/
>>>>>>>> Label: none uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>>>>>>> Total devices 1 FS bytes used 305.67GiB
>>>>>>>> devid 1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>>>>>>>
>>>>>>>> # btrfs filesystem usage /ssddisk/
>>>>>>>> Overall:
>>>>>>>> Device size: 500.00GiB
>>>>>>>> Device allocated: 500.00GiB
>>>>>>>> Device unallocated: 1.05MiB
>>>>>>>
>>>>>>> Drive is actually fully allocated so if Btrfs needs to create a new
>>>>>>> chunk right now, it can't. However,
>>>>>>
>>>>>> Yes, but there's a lot of free space:
>>>>>> Free (estimated): 193.46GiB (min: 193.46GiB)
>>>>>>
>>>>>> How does this match?
>>>>>>
>>>>>>> All three chunk types have quite a bit of unused space in them, so
>>>>>>> it's unclear why there's a no space left error.
>>>>>>>
>>>>>
>>>>> I remember a discussion that balance always tries to pre-allocate one
>>>>> chunk in advance, and I believe there was a patch to correct it, but I
>>>>> am not sure whether it was merged.
>>>>
>>>> http://www.spinics.net/lists/linux-btrfs/msg56772.html
>>>
>>> Thanks - I still don't understand why that one is not upstream or why it
>>> was reverted. Looks absolutely reasonable to me.
>>
>> It is upstream and hasn't been reverted.
>>
>> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/fs/btrfs/volumes.c?id=refs/tags/v4.8.11
>> line 3650
>>
>> I would try Duncan's idea of using just one filter and seeing what happens:
>>
>> 'btrfs balance start -dusage=1 '
>>
>
> Actually I just hit exactly the same symptoms on my VM, where the device
> was fully allocated and metadata balance failed, but data balance
> succeeded in freeing up space, which allowed metadata balance to run too.
> This is under 4.8.10.
>
> So it appears that the balance logic for data and metadata is somehow
> different.
>
> As this VM gets into the 100% allocated condition fairly often, I'll try
> to get a better understanding next time.
>
>>
>>>>>> With enospc debug it says:
>>>>>> [39193.425682] BTRFS warning (device vdb1): no space to allocate a new
>>>>>> chunk for block group 839941881856
>>>>>> [39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance
>>
>> It might be nice if this stated what kind of chunk it's trying to allocate.
>>
Re: Metadata balance fails ENOSPC
On 01.12.2016 at 16:48, Chris Murphy wrote:
> On Thu, Dec 1, 2016 at 7:10 AM, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
>>
>> On 01.12.2016 at 14:51, Hans van Kranenburg wrote:
>>> On 12/01/2016 09:12 AM, Andrei Borzenkov wrote:
>>>> On Thu, Dec 1, 2016 at 10:49 AM, Stefan Priebe - Profihost AG
>>>> <s.pri...@profihost.ag> wrote:
>>>> ...
>>>>>
>>>>> Custom 4.4 kernel with patches up to 4.10. But i already tried 4.9-rc7
>>>>> which does the same.
>>>>>
>>>>>
>>>>>>> # btrfs filesystem show /ssddisk/
>>>>>>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>>>>>>         Total devices 1 FS bytes used 305.67GiB
>>>>>>>         devid    1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>>>>>>
>>>>>>> # btrfs filesystem usage /ssddisk/
>>>>>>> Overall:
>>>>>>>     Device size:                 500.00GiB
>>>>>>>     Device allocated:            500.00GiB
>>>>>>>     Device unallocated:            1.05MiB
>>>>>>
>>>>>> Drive is actually fully allocated so if Btrfs needs to create a new
>>>>>> chunk right now, it can't. However,
>>>>>
>>>>> Yes but there's lot of free space:
>>>>> Free (estimated):            193.46GiB      (min: 193.46GiB)
>>>>>
>>>>> How does this match?
>>>>>
>>>>>
>>>>>> All three chunk types have quite a bit of unused space in them, so
>>>>>> it's unclear why there's a no space left error.
>>>>>>
>>>>
>>>> I remember discussion that balance always tries to pre-allocate one
>>>> chunk in advance, and I believe there was patch to correct it but I am
>>>> not sure whether it was merged.
>>>
>>> http://www.spinics.net/lists/linux-btrfs/msg56772.html
>>
>> Thanks - still don't understand why that one is not upstream or why it
>> was reverted. Looks absolutely reasonable to me.
>
> It is upstream and hasn't been reverted.
>
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/fs/btrfs/volumes.c?id=refs/tags/v4.8.11
> line 3650
>
> I would try Duncan's idea of using just one filter and seeing what happens:
>
> 'btrfs balance start -dusage=1 '

See below:

[zabbix-db ~]# btrfs balance start -dusage=1 /ssddisk/
Done, had to relocate 0 out of 505 chunks

[zabbix-db ~]# btrfs balance start -dusage=10 /ssddisk/
Done, had to relocate 0 out of 505 chunks

[zabbix-db ~]# btrfs balance start -musage=1 /ssddisk/
ERROR: error during balancing '/ssddisk/': No space left on device
There may be more info in syslog - try dmesg | tail

[zabbix-db ~]# dmesg
[78306.288834] BTRFS warning (device vdb1): no space to allocate a new chunk for block group 839941881856
[78306.289197] BTRFS info (device vdb1): 1 enospc errors during balance

>
>
>>>>> With enospc debug it says:
>>>>> [39193.425682] BTRFS warning (device vdb1): no space to allocate a new
>>>>> chunk for block group 839941881856
>>>>> [39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance
>
> It might be nice if this stated what kind of chunk it's trying to allocate.
>
>
>
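The manual attempts above (`-dusage=1`, then `-dusage=10`) follow a common pattern: raise the usage filter step by step until balance actually relocates a chunk. A minimal Python sketch of that loop follows. The helper names `balance_command` and `stepwise_balance`, the step values, and the check against the "relocate 0 out of" text are assumptions made for illustration; only the `btrfs balance start -dusage=N <path>` / `-musage=N` form and the output wording are taken from the thread.

```python
import subprocess


def balance_command(mountpoint, usage, kind="d"):
    """Build the argv for one filtered balance run ('d' = data, 'm' = metadata)."""
    if kind not in ("d", "m"):
        raise ValueError("kind must be 'd' or 'm'")
    return ["btrfs", "balance", "start", f"-{kind}usage={usage}", mountpoint]


def stepwise_balance(mountpoint, steps=(1, 5, 10, 25, 50), kind="d",
                     run=subprocess.run):
    """Run balance with increasing usage filters.

    Stops at the first step that either fails (non-zero exit status) or
    reports that it relocated at least one chunk, and returns
    (usage, result). Returns (None, None) if every step relocated nothing.
    """
    for usage in steps:
        result = run(balance_command(mountpoint, usage, kind),
                     capture_output=True, text=True)
        if result.returncode != 0:
            return usage, result  # e.g. the ENOSPC case seen with -musage=1
        if "relocate 0 out of" not in result.stdout:
            return usage, result  # something was actually relocated
    return None, None
```

The `run` parameter exists only so the loop can be exercised without touching a real filesystem. On a fully allocated device such as the one in this thread, every step may still relocate nothing or fail, so this is a convenience wrapper and not a fix.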
Re: Metadata balance fails ENOSPC
On 01.12.2016 at 14:51, Hans van Kranenburg wrote:
> On 12/01/2016 09:12 AM, Andrei Borzenkov wrote:
>> On Thu, Dec 1, 2016 at 10:49 AM, Stefan Priebe - Profihost AG
>> <s.pri...@profihost.ag> wrote:
>> ...
>>>
>>> Custom 4.4 kernel with patches up to 4.10. But i already tried 4.9-rc7
>>> which does the same.
>>>
>>>
>>>>> # btrfs filesystem show /ssddisk/
>>>>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>>>>         Total devices 1 FS bytes used 305.67GiB
>>>>>         devid    1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>>>>
>>>>> # btrfs filesystem usage /ssddisk/
>>>>> Overall:
>>>>>     Device size:                 500.00GiB
>>>>>     Device allocated:            500.00GiB
>>>>>     Device unallocated:            1.05MiB
>>>>
>>>> Drive is actually fully allocated so if Btrfs needs to create a new
>>>> chunk right now, it can't. However,
>>>
>>> Yes but there's lot of free space:
>>> Free (estimated):            193.46GiB      (min: 193.46GiB)
>>>
>>> How does this match?
>>>
>>>
>>>> All three chunk types have quite a bit of unused space in them, so
>>>> it's unclear why there's a no space left error.
>>>>
>>
>> I remember discussion that balance always tries to pre-allocate one
>> chunk in advance, and I believe there was patch to correct it but I am
>> not sure whether it was merged.
>
> http://www.spinics.net/lists/linux-btrfs/msg56772.html

Thanks - I still don't understand why that one is not upstream or why it
was reverted. It looks absolutely reasonable to me.

Another option would be to make it possible to turn allocated but unused
space back into unallocated space - no idea how to do that.

>
>>>> Try remounting with enospc_debug, and then trigger the problem again,
>>>> and post the resulting kernel messages.
>>>
>>> With enospc debug it says:
>>> [39193.425682] BTRFS warning (device vdb1): no space to allocate a new
>>> chunk for block group 839941881856
>>> [39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance
>
Re: Metadata balance fails ENOSPC
On 01.12.2016 at 09:12, Andrei Borzenkov wrote:
> On Thu, Dec 1, 2016 at 10:49 AM, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
> ...
>>
>> Custom 4.4 kernel with patches up to 4.10. But i already tried 4.9-rc7
>> which does the same.
>>
>>
>>>> # btrfs filesystem show /ssddisk/
>>>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>>>         Total devices 1 FS bytes used 305.67GiB
>>>>         devid    1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>>>
>>>> # btrfs filesystem usage /ssddisk/
>>>> Overall:
>>>>     Device size:                 500.00GiB
>>>>     Device allocated:            500.00GiB
>>>>     Device unallocated:            1.05MiB
>>>
>>> Drive is actually fully allocated so if Btrfs needs to create a new
>>> chunk right now, it can't. However,
>>
>> Yes but there's lot of free space:
>> Free (estimated):            193.46GiB      (min: 193.46GiB)
>>
>> How does this match?
>>
>>
>>> All three chunk types have quite a bit of unused space in them, so
>>> it's unclear why there's a no space left error.
>>>
>
> I remember discussion that balance always tries to pre-allocate one
> chunk in advance, and I believe there was patch to correct it but I am
> not sure whether it was merged.

Otherwise, is there any way to make the free space unallocated again?

Stefan

>
>>> Try remounting with enospc_debug, and then trigger the problem again,
>>> and post the resulting kernel messages.
>>
>> With enospc debug it says:
>> [39193.425682] BTRFS warning (device vdb1): no space to allocate a new
>> chunk for block group 839941881856
>> [39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance
>>
Re: Metadata balance fails ENOSPC
On 01.12.2016 at 00:02, Chris Murphy wrote:
> On Wed, Nov 30, 2016 at 2:03 PM, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
>> Hello,
>>
>> # btrfs balance start -v -dusage=0 -musage=1 /ssddisk/
>> Dumping filters: flags 0x7, state 0x0, force is off
>>   DATA (flags 0x2): balancing, usage=0
>>   METADATA (flags 0x2): balancing, usage=1
>>   SYSTEM (flags 0x2): balancing, usage=1
>> ERROR: error during balancing '/ssddisk/': No space left on device
>> There may be more info in syslog - try dmesg | tail
>
> You haven't provided kernel messages at the time of the error.

Kernel message:
[  429.107723] BTRFS info (device vdb1): 1 enospc errors during balance

> Also useful is the kernel version.

Custom 4.4 kernel with patches up to 4.10. But I already tried 4.9-rc7,
which does the same.

>> # btrfs filesystem show /ssddisk/
>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>         Total devices 1 FS bytes used 305.67GiB
>>         devid    1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>
>> # btrfs filesystem usage /ssddisk/
>> Overall:
>>     Device size:                 500.00GiB
>>     Device allocated:            500.00GiB
>>     Device unallocated:            1.05MiB
>
> Drive is actually fully allocated so if Btrfs needs to create a new
> chunk right now, it can't. However,

Yes, but there's a lot of free space:
Free (estimated):            193.46GiB      (min: 193.46GiB)

How does this match?

> All three chunk types have quite a bit of unused space in them, so
> it's unclear why there's a no space left error.
>
> Try remounting with enospc_debug, and then trigger the problem again,
> and post the resulting kernel messages.

With enospc debug it says:
[39193.425682] BTRFS warning (device vdb1): no space to allocate a new chunk for block group 839941881856
[39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance

Greets,
Stefan
Metadata balance fails ENOSPC
Hello,

# btrfs balance start -v -dusage=0 -musage=1 /ssddisk/
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=0
  METADATA (flags 0x2): balancing, usage=1
  SYSTEM (flags 0x2): balancing, usage=1
ERROR: error during balancing '/ssddisk/': No space left on device
There may be more info in syslog - try dmesg | tail

# btrfs filesystem show /ssddisk/
Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
        Total devices 1 FS bytes used 305.67GiB
        devid    1 size 500.00GiB used 500.00GiB path /dev/vdb1

# btrfs filesystem usage /ssddisk/
Overall:
    Device size:                 500.00GiB
    Device allocated:            500.00GiB
    Device unallocated:            1.05MiB
    Device missing:                  0.00B
    Used:                        305.69GiB
    Free (estimated):            185.78GiB      (min: 185.78GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 608.00KiB)

Data,single: Size:483.97GiB, Used:298.18GiB
   /dev/vdb1     483.97GiB

Metadata,single: Size:16.00GiB, Used:7.51GiB
   /dev/vdb1      16.00GiB

System,single: Size:32.00MiB, Used:144.00KiB
   /dev/vdb1      32.00MiB

Unallocated:
   /dev/vdb1       1.05MiB

How can I get balance working again?

Greets,
Stefan
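The usage output above mixes two different kinds of "free": space that is unused inside already allocated chunks, and space on the device that is not allocated to any chunk. A small Python sketch of that arithmetic, using only the rounded figures from the report. The function names and the 256 MiB chunk size are assumptions chosen for illustration; real chunk sizes depend on the filesystem and the chunk type.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# (allocated size, used) per chunk type, as printed by `btrfs filesystem usage`
chunks = {
    "data": (483.97 * GIB, 298.18 * GIB),
    "metadata": (16.00 * GIB, 7.51 * GIB),
}
unallocated = 1.05 * MIB  # reported value, taken as-is


def chunk_slack(chunks):
    """Unused bytes inside already allocated chunks, per chunk type."""
    return {name: size - used for name, (size, used) in chunks.items()}


def fits_new_chunk(unallocated_bytes, chunk_bytes):
    """True if a new chunk of the given size could still be allocated."""
    return unallocated_bytes >= chunk_bytes


slack = chunk_slack(chunks)
for name, free in slack.items():
    print(f"{name}: {free / GIB:.2f} GiB unused inside allocated chunks")

# Hypothetical chunk size, only to show the comparison.
assumed_chunk = 256 * MIB
print("room for one more chunk:", fits_new_chunk(unallocated, assumed_chunk))
```

With these numbers the data chunks hold roughly 185.79 GiB of slack, which lines up with the reported "Free (estimated)", while the 1.05 MiB of unallocated space is far too small for any new chunk. That is consistent with balance failing whenever it needs to allocate a fresh chunk first.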
Re: resend: Re: Btrfs: adjust len of writes if following a preallocated extent
On 23.11.2016 at 19:23, Holger Hoffstätte wrote:
> On 11/23/16 18:21, Stefan Priebe - Profihost AG wrote:
>> On 04.11.2016 at 20:20, Liu Bo wrote:
>>> If we have
>>>
>>> |0--hole--4095||4096--preallocate--12287|
>>>
>>> instead of using preallocated space, a 8K direct write will just
>>> create a new 8K extent and it'll end up with
>>>
>>> |0--new extent--8191||8192--preallocate--12287|
>>>
>>> It's because we find a hole em and then go to create a new 8K
>>> extent directly without adjusting @len.
>>
>> after applying that one on top of my 4.4 btrfs branch (includes patches
>> up to 4.10 / next). i'm getting deadlocks in btrfs.
>
> *ctrl+f sectorsize* ..
>
> That's not surprising if you did what I suspect. If your tree is based
> on my - now really very retired - 4.4.x queue, then you are likely missing
> _all the other blocksize/sectorsize patches_ that came in from Chandra
> Seetharaman et al., which I _really_ carefully patched around, for many
> good reasons.

*arg* that makes sense. Still, it's not easy to find out which ones to skip.
Yes, that one is based on yours.

Thanks,
Stefan

>
> -h
>
resend: Re: Btrfs: adjust len of writes if following a preallocated extent
Hi, sorry last mail was from the wrong box. Am 04.11.2016 um 20:20 schrieb Liu Bo: > If we have > > |0--hole--4095||4096--preallocate--12287| > > instead of using preallocated space, a 8K direct write will just > create a new 8K extent and it'll end up with > > |0--new extent--8191||8192--preallocate--12287| > > It's because we find a hole em and then go to create a new 8K > extent directly without adjusting @len. after applying that one on top of my 4.4 btrfs branch (includes patches up to 4.10 / next). i'm getting deadlocks in btrfs. Traces here: INFO: task btrfs-transacti:604 blocked for more than 120 seconds. Not tainted 4.4.34 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. btrfs-transacti D 8814e78cbe00 0 604 2 0x0008 8814e78cbe00 88017367a540 8814e2f88000 8814e78cc000 8814e78cbe38 88123616c510 8814e24c81f0 88153fb0a000 8814e78cbe18 816a8425 8814e63165a0 8814e78cbe88 Call Trace: [] schedule+0x35/0x80 [] btrfs_commit_transaction+0x275/0xa50 [btrfs] [] transaction_kthread+0x1d6/0x200 [btrfs] [] kthread+0xdb/0x100 [] ret_from_fork+0x3f/0x70 DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70 Leftover inexact backtrace: [] ? kthread_park+0x60/0x60 INFO: task mysqld:1977 blocked for more than 120 seconds. Not tainted 4.4.34 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mysqld D 88142ef1bcf8 0 1977 1 0x0008 88142ef1bcf8 81e0f500 8814dc2c4a80 88142ef1c000 8814e32ed298 8814e32ed2c0 88110aa9a000 8814e32ed000 88142ef1bd10 816a8425 8814e32ed000 88142ef1bd60 Call Trace: [] schedule+0x35/0x80 [] wait_for_writer+0xa2/0xb0 [btrfs] [] btrfs_sync_log+0xe9/0xa00 [btrfs] [] btrfs_sync_file+0x35f/0x3d0 [btrfs] [] vfs_fsync_range+0x3d/0xb0 [] do_fsync+0x3d/0x70 [] SyS_fsync+0x10/0x20 [] entry_SYSCALL_64_fastpath+0x12/0x71 DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x71 Leftover inexact backtrace: INFO: task mysqld:3249 blocked for more than 120 seconds. 
Not tainted 4.4.34 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mysqld D 881475fdfa40 0 3249 1 0x0008 881475fdfa40 88017367ca80 8814433d2540 881475fe 88040da39ba0 0023 88040da39c20 00238000 881475fdfa58 816a8425 8000 881475fdfb18 Call Trace: [] schedule+0x35/0x80 [] wait_ordered_extents.isra.18.constprop.23+0x147/0x3d0 [btrfs] [] btrfs_log_changed_extents+0x242/0x610 [btrfs] [] btrfs_log_inode+0x874/0xb80 [btrfs] [] btrfs_log_inode_parent+0x22c/0x910 [btrfs] [] btrfs_log_dentry_safe+0x62/0x80 [btrfs] [] btrfs_sync_file+0x28c/0x3d0 [btrfs] [] vfs_fsync_range+0x3d/0xb0 [] do_fsync+0x3d/0x70 [] SyS_fsync+0x10/0x20 [] entry_SYSCALL_64_fastpath+0x12/0x71 DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x71 Leftover inexact backtrace: INFO: task mysqld:3250 blocked for more than 120 seconds. Not tainted 4.4.34 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mysqld D 881374edb868 0 3250 1 0x0008 881374edb868 8801736b2540 8814433d4a80 881374edc000 8814e26f81c8 8814e26f81e0 00238000 000a8000 881374edb880 816a8425 8814433d4a80 881374edb8d8 Call Trace: [] schedule+0x35/0x80 [] rwsem_down_read_failed+0xed/0x130 [] call_rwsem_down_read_failed+0x14/0x30 DWARF2 unwinder stuck at call_rwsem_down_read_failed+0x14/0x30 Leftover inexact backtrace: [] ? down_read+0x17/0x20 [] btrfs_create_dio_extent+0x46/0x1e0 [btrfs] [] btrfs_get_blocks_direct+0x3d8/0x730 [btrfs] [] ? btrfs_submit_direct+0x1ce/0x740 [btrfs] [] do_blockdev_direct_IO+0x11f7/0x2bc0 [] ? btrfs_page_exists_in_range+0xe0/0xe0 [btrfs] [] ? btrfs_getattr+0xa0/0xa0 [btrfs] [] __blockdev_direct_IO+0x43/0x50 [] ? btrfs_getattr+0xa0/0xa0 [btrfs] [] btrfs_direct_IO+0x1d1/0x380 [btrfs] [] ? btrfs_getattr+0xa0/0xa0 [btrfs] [] generic_file_direct_write+0xaa/0x170 [] btrfs_file_write_iter+0x2ae/0x560 [btrfs] [] ? futex_wake+0x81/0x150 [] new_sync_write+0x84/0xb0 [] __vfs_write+0x26/0x40 [] vfs_write+0xa9/0x190 [] ? 
enter_from_user_mode+0x1f/0x50 [] SyS_pwrite64+0x6b/0xa0 [] ? syscall_return_slowpath+0xb0/0x130 [] entry_SYSCALL_64_fastpath+0x12/0x71 INFO: task btrfs-transacti:604 blocked for more than 120 seconds. Not tainted 4.4.34 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. btrfs-transacti D 8814e78cbe00 0 604 2 0x0008 8814e78cbe00 88017367a540 8814e2f88000 8814e78cc000 8814e78cbe38 88123616c510 8814e24c81f0 88153fb0a000 8814e78cbe18 816a8425 8814e63165a0 8814e78cbe88 Call Trace: [] schedule+0x35/0x80 [] btrfs_commit_transaction+0x275/0xa50 [btrfs] []
Re: spinning kworker with space_cache=v2 searching for free space
On 12.11.2016 at 03:18, Liu Bo wrote:
> On Wed, Nov 09, 2016 at 09:19:21PM +0100, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> found this one from 2014:
>> https://patchwork.kernel.org/patch/5551651/
>>
>> it this still valid?
>
> The space cache code doesn't change a lot, so I think the patch is still
> valid to apply(there might be some conflicts though), but I'm not sure
> if it could help the spinning case.

Thanks, I got it applied and will try it. Any other ideas why it's
spinning there? Free space fragmentation? But at least on one machine
there are 26TB free and it's still spinning, which slows down
performance.

Greets,
Stefan

>
> Thanks,
>
> -liubo
>>
>> On 09.11.2016 at 09:09, Stefan Priebe - Profihost AG wrote:
>>> Dear list,
>>>
>>> even there's a lot of free space on my disk:
>>>
>>> # df -h /vmbackup/
>>> Filesystem                  Size  Used Avail Use% Mounted on
>>> /dev/mapper/stripe0-backup   37T   24T   13T  64% /backup
>>>
>>> # btrfs filesystem df /backup/
>>> Data, single: total=23.75TiB, used=22.83TiB
>>> System, DUP: total=8.00MiB, used=3.94MiB
>>> Metadata, DUP: total=283.50GiB, used=105.82GiB
>>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>>
>>> I always have a kworker process endless spinning.
>>>
>>> # perf top shows:
>>>   47,56%  [kernel]  [k] rb_next
>>>    7,71%  [kernel]  [k] tree_search_offset.isra.25
>>>    6,44%  [kernel]  [k] btrfs_find_space_for_alloc
>>>
>>> Mount options:
>>> rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,skip_balance
>>>
>>> What's wrong here?
>>>
>>> Greets,
>>> Stefan
>>>
Re: spinning kworker with space_cache=v2 searching for free space
Hello,

I found this one from 2014:
https://patchwork.kernel.org/patch/5551651/

Is this still valid?

On 09.11.2016 at 09:09, Stefan Priebe - Profihost AG wrote:
> Dear list,
>
> even there's a lot of free space on my disk:
>
> # df -h /vmbackup/
> Filesystem                  Size  Used Avail Use% Mounted on
> /dev/mapper/stripe0-backup   37T   24T   13T  64% /backup
>
> # btrfs filesystem df /backup/
> Data, single: total=23.75TiB, used=22.83TiB
> System, DUP: total=8.00MiB, used=3.94MiB
> Metadata, DUP: total=283.50GiB, used=105.82GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> I always have a kworker process endless spinning.
>
> # perf top shows:
>   47,56%  [kernel]  [k] rb_next
>    7,71%  [kernel]  [k] tree_search_offset.isra.25
>    6,44%  [kernel]  [k] btrfs_find_space_for_alloc
>
> Mount options:
> rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,skip_balance
>
> What's wrong here?
>
> Greets,
> Stefan
>
spinning kworker with space_cache=v2 searching for free space
Dear list,

even though there's a lot of free space on my disk:

# df -h /vmbackup/
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/stripe0-backup   37T   24T   13T  64% /backup

# btrfs filesystem df /backup/
Data, single: total=23.75TiB, used=22.83TiB
System, DUP: total=8.00MiB, used=3.94MiB
Metadata, DUP: total=283.50GiB, used=105.82GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I always have a kworker process spinning endlessly.

# perf top shows:
  47,56%  [kernel]  [k] rb_next
   7,71%  [kernel]  [k] tree_search_offset.isra.25
   6,44%  [kernel]  [k] btrfs_find_space_for_alloc

Mount options:
rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,skip_balance

What's wrong here?

Greets,
Stefan
recover btrfs with kernel 4.9-rc3 but btrfs progs fails
Hi, currently i've an fs which triggers this one on mount while originally having 50% disk free - but btrfs progs fails too. # btrfs check --repair -p /dev/vdb1 enabling repair mode couldn't open RDWR because of unsupported option features (3). ERROR: cannot open file system [ 164.378512] BTRFS info (device vdb1): using free space tree [ 164.378513] BTRFS info (device vdb1): has skinny extents [ 205.671655] [ cut here ] [ 205.671686] WARNING: CPU: 10 PID: 4629 at fs/btrfs/extent-tree.c:2961 btrfs_run_delayed_refs+0x28d/0x2c0 [btrfs] [ 205.671689] BTRFS: error (device vdb1) in btrfs_run_delayed_refs:2961: errno=-28 No space left [ 205.671695] BTRFS: error (device vdb1) in btrfs_create_pending_block_groups:10349: errno=-28 No space left [ 205.671764] BTRFS: error (device vdb1) in btrfs_create_pending_block_groups:10353: errno=-28 No space left [ 205.671770] BTRFS: error (device vdb1) in add_block_group_free_space:1339: errno=-28 No space left [ 205.671928] BTRFS: Transaction aborted (error -28) [ 205.671929] Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 xt_multiport iptable_filter ip_tables x_tables i2c_piix4 i2c_core button crc32_pclmul ghash_clmulni_intel loop btrfs xor raid6_pq usbhid ata_generic virtio_blk virtio_net uhci_hcd ehci_hcd usbcore virtio_pci usb_common ata_piix floppy [ 205.671943] CPU: 10 PID: 4629 Comm: mount Tainted: GW 4.9.0-rc3 #1 [ 205.671944] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140722_172050-sagunt 04/01/2014 [ 205.671946] b2ed110f78f0 a23f2c93 b2ed110f7940 [ 205.671948] b2ed110f7930 a20865f1 0b9129130170 8b4e7f1bc000 [ 205.671949] 8b4e29130170 8b4e27e5f000 003105b4 [ 205.671951] Call Trace: [ 205.671956] [] dump_stack+0x63/0x90 [ 205.671960] [] __warn+0xd1/0xf0 [ 205.671962] [] warn_slowpath_fmt+0x4f/0x60 [ 205.671972] [] btrfs_run_delayed_refs+0x28d/0x2c0 [btrfs] [ 205.671982] [] btrfs_commit_transaction+0x29/0x70 [btrfs] [ 205.671993] [] btrfs_recover_log_trees+0x3b3/0x440 [btrfs] [ 205.672004] [] ? 
replay_one_extent+0x730/0x730 [btrfs] [ 205.672013] [] open_ctree+0x264d/0x2760 [btrfs] [ 205.672020] [] btrfs_mount+0xcc7/0xe00 [btrfs] [ 205.672023] [] ? pcpu_next_unpop+0x40/0x50 [ 205.672025] [] ? find_next_bit+0x15/0x20 [ 205.672026] [] ? pcpu_alloc+0x32d/0x620 [ 205.672028] [] mount_fs+0x15/0x90 [ 205.672030] [] vfs_kern_mount+0x67/0x110 [ 205.672037] [] btrfs_mount+0x2ac/0xe00 [btrfs] [ 205.672039] [] ? pcpu_next_unpop+0x40/0x50 [ 205.672040] [] ? find_next_bit+0x15/0x20 [ 205.672041] [] mount_fs+0x15/0x90 [ 205.672042] [] vfs_kern_mount+0x67/0x110 [ 205.672044] [] do_mount+0x192/0xc30 [ 205.672045] [] ? memdup_user+0x42/0x60 [ 205.672046] [] SyS_mount+0x94/0xd0 [ 205.672048] [] do_syscall_64+0x69/0x200 [ 205.672049] [] entry_SYSCALL64_slow_path+0x25/0x25 [ 205.672050] ---[ end trace be50fce8648d2575 ]--- [ 205.672052] BTRFS: error (device vdb1) in btrfs_run_delayed_refs:2961: errno=-28 No space left [ 205.672109] BTRFS: error (device vdb1) in btrfs_replay_log:2491: errno=-28 No space left (Failed to recover log tree) [ 206.061658] BTRFS error (device vdb1): pending csums is 643072 [ 206.061801] BTRFS error (device vdb1): cleaner transaction attach returned -30 [ 206.577900] BTRFS error (device vdb1): open_ctree failed Stefan -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Hello list, just wanted to report that my ENOSPC errors are gone. Thanks to wang for his great patches. but the space_info corruption still occours. On every umount i see: [93022.166222] BTRFS: space_info 4 has 208952672256 free, is not full [93022.166224] BTRFS: space_info total=363998478336, used=155045216256, pinned=0, reserved=0, may_use=524288, readonly=65536 Greets, Stefan Am 29.09.2016 um 09:27 schrieb Stefan Priebe - Profihost AG: > Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang: >>>> I found that compress sometime report ENOSPC error even in 4.8-rc8, >>>> currently >>> I cannot confirm that as i do not have anough space to test this without >>> compression ;-( But yes i've compression enabled. >> I might not get you, my poor english :) >> You mean that you only get ENOSPC error when compression is enabled? >> >> And when compression is not enabled, you do not get ENOSPC error? > > I can't tell you. I cannot test with compression not enabled. I do not > have anough free space on this disk. > >>>> I'm trying to fix it. >>> That sounds good but do you also get the >>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>> >>> kernel messages on umount? if not you might have found another problem. >> Yes, I seem similar messages, you can paste you whole dmesg info here. 
> > [ cut here ] > WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 > btrfs_free_block_groups+0x346/0x430 [btrfs]() > Modules linked in: netconsole xt_multiport iptable_filter ip_tables > x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm > irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan > ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop > btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov > async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit > i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd > sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid > CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 > Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 > 880fda777d00 813b69c3 > c067a099 880fda777d38 810821c6 > 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 > Call Trace: > [] dump_stack+0x63/0x90 > [] warn_slowpath_common+0x86/0xc0 > [] warn_slowpath_null+0x1a/0x20 > [] btrfs_free_block_groups+0x346/0x430 [btrfs] > [] close_ctree+0x15d/0x330 [btrfs] > [] btrfs_put_super+0x19/0x20 [btrfs] > [] generic_shutdown_super+0x6f/0x100 > [] kill_anon_super+0x12/0x20 > [] btrfs_kill_super+0x16/0xa0 [btrfs] > [] deactivate_locked_super+0x43/0x70 > [] deactivate_super+0x5c/0x60 > [] cleanup_mnt+0x3f/0x90 > [] __cleanup_mnt+0x12/0x20 > [] task_work_run+0x81/0xa0 > [] exit_to_usermode_loop+0xb0/0xc0 > [] syscall_return_slowpath+0xd4/0x130 > [] int_ret_from_sys_call+0x25/0x8f > ---[ end trace cee6ace13018e13e ]--- > [ cut here ] > WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791 > btrfs_free_block_groups+0x365/0x430 [btrfs]() > Modules linked in: netconsole xt_multiport iptable_filter ip_tables > x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm > irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan > ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop > btrfs dm_mod 
raid10 raid0 multipath linear raid456 async_raid6_recov > async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit > i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd > sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid > CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1 > Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 > 880fda777d00 813b69c3 > c067a099 880fda777d38 810821c6 > 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 > Call Trace: > [] dump_stack+0x63/0x90 > [] warn_slowpath_common+0x86/0xc0 > [] warn_slowpath_null+0x1a/0x20 > [] btrfs_free_block_groups+0x365/0x430 [btrfs] > [] close_ctree+0x15d/0x330 [btrfs] > [] btrfs_put_super+0x19/0x20 [btrfs] > [] generic_shutdown_super+0x6f/0x100 > [] kill_anon_super+0x12/0x20 > [] btrfs_kill_super+0x16/0xa0 [btrfs] > [] deactivate_locked_super+0x43/0x70 > [] deactivate_super+0x5c/0x60 > [] cleanup_mnt+0x3f/0x90 > [] __cleanup_mnt+0x12/0x20 > [] task_work_run+0x81/0xa0 > [] exit_to_usermode_loop+0xb0/0xc0 > [] syscall_return_slowpath+0xd4/0x130 > [] in
Re: [PATCH 1/2] btrfs: improve inode's outstanding_extents computation
Hello list, just want to report again that i've seen not a single ENOSPC msg with this series applied. Now working fine since 18 days. Stefan Am 14.10.2016 um 15:09 schrieb Stefan Priebe - Profihost AG: > > Am 06.10.2016 um 04:51 schrieb Wang Xiaoguang: >> This issue was revealed by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, >> When modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often >> gets these warnings from btrfs_destroy_inode(): >> WARN_ON(BTRFS_I(inode)->outstanding_extents); >> WARN_ON(BTRFS_I(inode)->reserved_extents); >> >> Simple test program below can reproduce this issue steadily. >> Note: you need to modify BTRFS_MAX_EXTENT_SIZE to 64KB to have test, >> otherwise there won't be such WARNING. >> #include >> #include >> #include >> #include >> #include >> >> int main(void) >> { >> int fd; >> char buf[68 *1024]; >> >> memset(buf, 0, 68 * 1024); >> fd = open("testfile", O_CREAT | O_EXCL | O_RDWR); >> pwrite(fd, buf, 68 * 1024, 64 * 1024); >> return; >> } >> >> When BTRFS_MAX_EXTENT_SIZE is 64KB, and buffered data range is: >> 64KB 128K132KB >> |---|---| >> 64 + 4KB >> >> 1) for above data range, btrfs_delalloc_reserve_metadata() will reserve >> metadata and set BTRFS_I(inode)->outstanding_extents to 2. >> (68KB + 64KB - 1) / 64KB == 2 >> >> Outstanding_extents: 2 >> >> 2) then btrfs_dirty_page() will be called to dirty pages and set >> EXTENT_DELALLOC flag. In this case, btrfs_set_bit_hook() will be called >> twice. >> The 1st set_bit_hook() call will set DEALLOC flag for the first 64K. >> 64KB 128KB >> |---| >> 64KB DELALLOC >> Outstanding_extents: 2 >> >> Set_bit_hooks() uses FIRST_DELALLOC flag to avoid re-increase >> outstanding_extents counter. >> So for 1st set_bit_hooks() call, it won't modify outstanding_extents, >> it's still 2. >> >> Then FIRST_DELALLOC flag is *CLEARED*. >> >> 3) 2nd btrfs_set_bit_hook() call. 
>> Because FIRST_DELALLOC have been cleared by previous set_bit_hook(), >> btrfs_set_bit_hook() will increase BTRFS_I(inode)->outstanding_extents by >> one, so now BTRFS_I(inode)->outstanding_extents is 3. >> 64KB128KB132KB >> |---|| >> 64K DELALLOC 4K DELALLOC >> Outstanding_extents: 3 >> >> But the correct outstanding_extents number should be 2, not 3. >> The 2nd btrfs_set_bit_hook() call just screwed up this, and leads to the >> WARN_ON(). >> >> Normally, we can solve it by only increasing outstanding_extents in >> set_bit_hook(). >> But the problem is for delalloc_reserve/release_metadata(), we only have >> a 'length' parameter, and calculate in-accurate outstanding_extents. >> If we only rely on set_bit_hook() release_metadata() will crew things up >> as it will decrease inaccurate number. >> >> So the fix we use is: >> 1) Increase *INACCURATE* outstanding_extents at delalloc_reserve_meta >>Just as a place holder. >> 2) Increase *accurate* outstanding_extents at set_bit_hooks() >>This is the real increaser. >> 3) Decrease *INACCURATE* outstanding_extents before returning >>This makes outstanding_extents to correct value. >> >> For 128M BTRFS_MAX_EXTENT_SIZE, due to limitation of >> __btrfs_buffered_write(), each iteration will only handle about 2MB >> data. >> So btrfs_dirty_pages() won't need to handle cases cross 2 extents. >> >> Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com> > > Tested-by: Stefan Priebe <s.pri...@profihost.ag> > > Works fine since 8 days - no ENOSPC errors anymore. > > Greets, > Stefan > >> --- >> fs/btrfs/ctree.h | 2 ++ >> fs/btrfs/inode.c | 65 >> ++-- >> fs/btrfs/ioctl.c | 6 ++ >> 3 files changed, 62 insertions(+), 11 deletions(-) >> >> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h >> index 33fe035..16885f6 100644 >> --- a/fs/btrfs/ctree.h >> +++ b/fs/btrfs/ctree.h >> @@ -3119,6 +3119,
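The counting problem described in the patch text above can be replayed with a toy model. The Python sketch below is not kernel code: `extents_for`, `old_accounting` and `new_accounting` are hypothetical names, and the model only follows the three numbered steps of the description (a placeholder at reserve time, an accurate increment per set_bit_hook call, and removal of the placeholder), using the 64KB maximum extent size from the example.

```python
KIB = 1024
MAX_EXTENT_SIZE = 64 * KIB  # the reduced value used in the patch description


def extents_for(length, max_extent=MAX_EXTENT_SIZE):
    """Ceiling division: extents needed to cover `length` bytes."""
    return (length + max_extent - 1) // max_extent


def old_accounting(ranges):
    """Model of the behaviour the description calls wrong.

    `ranges` holds the length of each range passed to one set_bit_hook call.
    """
    outstanding = extents_for(sum(ranges))  # reserve step, total length only
    first_delalloc = True
    for _ in ranges:
        if first_delalloc:
            first_delalloc = False  # first call: the flag suppresses the increment
        else:
            outstanding += 1        # later calls: the range is counted again
    return outstanding


def new_accounting(ranges):
    """Model of the three-step scheme listed in the description."""
    placeholder = extents_for(sum(ranges))
    outstanding = placeholder                # 1) inaccurate placeholder at reserve
    for length in ranges:
        outstanding += extents_for(length)   # 2) accurate increment per hook call
    outstanding -= placeholder               # 3) drop the placeholder again
    return outstanding


# The 68KB write from the example: one 64KB range plus one 4KB range.
write = [64 * KIB, 4 * KIB]
print("old:", old_accounting(write))
print("new:", new_accounting(write))
```

For the 64KB + 4KB case the old model ends at 3 outstanding extents and the new one at 2, matching the numbers given in the description.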
Re: speed up cp --reflink=always
On 17.10.2016 at 03:50, Qu Wenruo wrote:
> At 10/17/2016 02:54 AM, Stefan Priebe - Profihost AG wrote:
>> On 16.10.2016 at 00:37, Hans van Kranenburg wrote:
>>> Hi,
>>>
>>> On 10/15/2016 10:49 PM, Stefan Priebe - Profihost AG wrote:
>>>>
>>>> cp --reflink=always takes sometimes very long. (i.e. 25-35 minutes)
>>>>
>>>> An example:
>>>>
>>>> source file:
>>>> # ls -la vm-279-disk-1.img
>>>> -rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img
>>>>
>>>> target file after around 10 minutes:
>>>> # ls -la vm-279-disk-1.img.tmp
>>>> -rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp
>>>
>>> Two quick thoughts:
>>> 1. How many extents does this img have?
>>
>> filefrag says:
>> 1011508 extents found
>
> Too many fragments.
> Average extent size is only about 200K.
> Quite common for VM images, if not setting no copy-on-write (C) attr.
>
> Normally it's not a good idea to put VM images into btrfs without any
> tuning.

Those are backups, written sequentially just once. As far as I know, the extent size is hardcoded to 128k for compression, isn't it?

Stefan

> Thanks,
> Qu
>>
>>> 2. Is this an XY problem? Why not just put the img in a subvolume and
>>> snapshot that?
>>
>> Sorry what's XY problem?
>>
>> Implementing cp reflink was easier - as the original code was based on
>> XFS. But shouldn't be cp reflink / clone a file be nearly identical to a
>> snapshot? Just creating refs to the extents?
>>
>> Greets,
>> Stefan
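The "about 200K" figure quoted above can be checked directly from the file size reported by ls and the extent count reported by filefrag. A minimal sketch in Python; the function name is made up for illustration and is not part of any btrfs tool:

```python
def average_extent_size(file_size_bytes: int, extent_count: int) -> float:
    """Return the mean extent size in bytes."""
    if extent_count <= 0:
        raise ValueError("extent_count must be positive")
    return file_size_bytes / extent_count


# Values taken from the ls and filefrag output in this thread.
FILE_SIZE = 204010946560
EXTENTS = 1011508

avg = average_extent_size(FILE_SIZE, EXTENTS)
print(f"average extent size: {avg / 1024:.0f} KiB")
```

This only gives a mean; it says nothing about how the individual extent sizes are distributed.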
Re: speed up cp --reflink=always
On 16.10.2016 at 21:48, Hans van Kranenburg wrote:
> On 10/16/2016 08:54 PM, Stefan Priebe - Profihost AG wrote:
>> On 16.10.2016 at 00:37, Hans van Kranenburg wrote:
>>> On 10/15/2016 10:49 PM, Stefan Priebe - Profihost AG wrote:
>>>>
>>>> cp --reflink=always takes sometimes very long. (i.e. 25-35 minutes)
>>>>
>>>> An example:
>>>>
>>>> source file:
>>>> # ls -la vm-279-disk-1.img
>>>> -rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img
>>>>
>>>> target file after around 10 minutes:
>>>> # ls -la vm-279-disk-1.img.tmp
>>>> -rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp
>>>
>>> Two quick thoughts:
>>> 1. How many extents does this img have?
>>
>> filefrag says:
>> 1011508 extents found
>
> To cp --reflink this, the filesystem needs to create a million new
> EXTENT_DATA objects for the new file, which point all parts of the new
> file to all the little same parts of the old file, and probably also
> needs to update a million EXTENT_DATA objects in the btrees to add a
> second backreference back to the new file.

Thanks for this explanation.

>
>>> 2. Is this an XY problem? Why not just put the img in a subvolume and
>>> snapshot that?
>>
>> Sorry what's XY problem?
>
> It means that I suspected that your actual goal is not spending time to
> work on optimizing how cp --reflink works, but that you just want to use
> the quickest way to have a clone of the file.
>
> An XY problem is when someone has problem X, then thinks about solution
> Y to solve it, then runs into a problem/limitation/whatever when trying
> Y and asks help with that actual problem when doing Y while there might
> in the end be a better solution to get X done.

Ah ;-) makes sense.

>> Implementing cp reflink was easier - as the original code was based on
>> XFS. But shouldn't be cp reflink / clone a file be nearly identical to a
>> snapshot? Just creating refs to the extents?
>
> Snapshotting a subvolume only has to write a cowed copy of the top-level
> information of the subvolume filesystem tree, and leaves the extent tree
> alone. It doesn't have to do 2 million different things. \o/

Thanks for this explanation. I will look into switching to subvolumes. I wasn't able to do this before, as I was always running into ENOSPC issues, which were solved last week.

Greets,
Stefan
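The difference Hans describes can be put into a rough back-of-the-envelope model. The sketch below is only an illustration of his explanation (one new EXTENT_DATA item plus one backreference update per extent for a reflink copy, a constant amount of work for a snapshot); the function names and the constant 1 are assumptions, not measurements of what the kernel actually does:

```python
def reflink_item_updates(extent_count: int) -> int:
    """Model: one new EXTENT_DATA item plus one backref update per extent."""
    return 2 * extent_count


def snapshot_item_updates(extent_count: int) -> int:
    """Model: a snapshot copies the tree root only, independent of extent count."""
    return 1


def seconds_per_extent(total_seconds: float, extent_count: int) -> float:
    """Average wall-clock time spent per extent for an observed copy."""
    return total_seconds / extent_count


EXTENTS = 1011508  # from the filefrag output in this thread

print("reflink updates :", reflink_item_updates(EXTENTS))
print("snapshot updates:", snapshot_item_updates(EXTENTS))
# 30 minutes is the middle of the reported 25-35 minute range.
print(f"time per extent : {seconds_per_extent(30 * 60, EXTENTS) * 1000:.2f} ms")
```

The first number is where the "2 million different things" in the message above comes from.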
Re: speed up cp --reflink=always
On 16.10.2016 at 00:37, Hans van Kranenburg wrote:
> Hi,
>
> On 10/15/2016 10:49 PM, Stefan Priebe - Profihost AG wrote:
>>
>> cp --reflink=always takes sometimes very long. (i.e. 25-35 minutes)
>>
>> An example:
>>
>> source file:
>> # ls -la vm-279-disk-1.img
>> -rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img
>>
>> target file after around 10 minutes:
>> # ls -la vm-279-disk-1.img.tmp
>> -rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp
>
> Two quick thoughts:
> 1. How many extents does this img have?

filefrag says:
1011508 extents found

> 2. Is this an XY problem? Why not just put the img in a subvolume and
> snapshot that?

Sorry, what's an XY problem?

Implementing cp reflink was easier, as the original code was based on XFS. But shouldn't cp reflink / cloning a file be nearly identical to a snapshot? Just creating refs to the extents?

Greets,
Stefan
speed up cp --reflink=always
Hello, cp --reflink=always takes sometimes very long. (i.e. 25-35 minutes) An example: source file: # ls -la vm-279-disk-1.img -rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img target file after around 10 minutes: # ls -la vm-279-disk-1.img.tmp -rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp I/O Waits are at around 6% but disk usage is at around 100%. The process using most of the disk I/O is a kworker process. A function trace of this kworker for 30s is already 44MB - no idea where to upload. This volume uses space_cache=v2. While digging through it i see a lot of this calls: kworker/u65:4-20679 [007] 46021.641882: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641882: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641882: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641882: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641882: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641882: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641882: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641882: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_get_token_32 
<-btrfs_del_items kworker/u65:4-20679 [007] 46021.641883: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641884: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641885: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641885: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641885: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641885: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641885: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641885: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641886: btrfs_set_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641886: btrfs_get_token_32 <-btrfs_del_items kworker/u65:4-20679 [007] 46021.641886: btrfs_set_token_32 <-btrfs_del_items Sorting the calls shows: 4892 _raw_spin_lock <-free_extent_buffer 4894 release_extent_buffer <-free_extent_buffer 6803 map_private_extent_buffer <-generic_bin_search.constprop.36 6839 __set_page_dirty_nobuffers <-btree_set_page_dirty 6840 btree_set_page_dirty <-set_page_dirty 6840 mem_cgroup_begin_page_stat <-__set_page_dirty_nobuffers 6840 page_mapping 
<-set_page_dirty 6840 set_page_dirty <-set_extent_buffer_dirty 6841 mem_cgroup_end_page_stat <-__set_page_dirty_nobuffers 7521 btrfs_clear_lock_blocking_rw <-btrfs_clear_path_blocking 7967 btrfs_get_token_64 <-read_block_for_search.isra.33 8018 btrfs_set_token_32 <-btrfs_del_items 8235 btrfs_get_token_32 <-btrfs_del_items 8813 btrfs_set_lock_blocking_rw <-btrfs_set_path_blocking 9235 map_private_extent_buffer <-btrfs_get_token_32 11824 btrfs_set_token_32 <-btrfs_extend_item 12090 map_private_extent_buffer <-btrfs_get_token_64 12367 mark_page_accessed <-mark_extent_buffer_accessed 12621 btrfs_get_token_32 <-btrfs_extend_item 16267
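A per-call summary like the sorted list above can be produced from a function trace with a few lines of Python. This is only a sketch of one way to do it, assuming trace lines of the form "task-pid [cpu] timestamp: function <-caller" as shown in this message; it is not necessarily how the numbers above were generated, and count_calls is a made-up helper name:

```python
from collections import Counter


def count_calls(trace_lines):
    """Count occurrences of each 'function <-caller' pair in a function trace."""
    counts = Counter()
    for line in trace_lines:
        # Everything after the first ": " is the "function <-caller" part.
        _, sep, call = line.partition(": ")
        if not sep or "<-" not in call:
            continue  # skip headers and lines in another format
        counts[" ".join(call.split())] += 1
    return counts


sample = [
    "kworker/u65:4-20679 [007] 46021.641882: btrfs_set_token_32 <-btrfs_del_items",
    "kworker/u65:4-20679 [007] 46021.641882: btrfs_get_token_32 <-btrfs_del_items",
    "kworker/u65:4-20679 [007] 46021.641882: btrfs_set_token_32 <-btrfs_del_items",
]

# Print ascending by count, like the list in the message.
for call, n in sorted(count_calls(sample).items(), key=lambda kv: kv[1]):
    print(n, call)
```

For a real 44MB trace the same function can be fed the open file object line by line instead of a list.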
Re: btrfs and numa - needing drop_caches to keep speed up
Hi,

On 14.10.2016 at 15:19, Stefan Priebe - Profihost AG wrote:
> Dear julian,
>
> On 14.10.2016 at 14:26, Julian Taylor wrote:
>> On 10/14/2016 08:28 AM, Stefan Priebe - Profihost AG wrote:
>>> Hello list,
>>>
>>> while running the same workload on two machines (single xeon and a dual
>>> xeon) both with 64GB RAM.
>>>
>>> I need to run echo 3 >/proc/sys/vm/drop_caches every 15-30 minutes to
>>> keep the speed as good as on the non numa system. I'm not sure whether
>>> this is related to numa.
>>>
>>> Is there any sysctl parameter to tune?
>>>
>>> Tested with vanilla v4.8.1
>>>
>>> Greets,
>>> Stefan
>>
>> hi,
>> why do you think this is related to btrfs?
>
> was just an idea as i couldn't find any other difference between those
> systems.
>
>> This is easy to diagnose but recording some kernel stacks during the
>> problem with perf.
>
> you just mean perf top? Does it also show locking problems? As i see not
> much CPU usage in that case.

perf top looks like this:

  5,46%  libc-2.19.so  [.] memset
  5,26%  [kernel]      [k] page_fault
  3,63%  [kernel]      [k] clear_page_c_e
  1,38%  [kernel]      [k] _raw_spin_lock
  1,06%  [kernel]      [k] get_page_from_freelist
  0,83%  [kernel]      [k] copy_user_enhanced_fast_string
  0,79%  [kernel]      [k] release_pages
  0,68%  [kernel]      [k] handle_mm_fault
  0,57%  [kernel]      [k] free_hot_cold_page
  0,55%  [kernel]      [k] handle_pte_fault
  0,54%  [kernel]      [k] __pagevec_lru_add_fn
  0,45%  [kernel]      [k] unmap_page_range
  0,45%  [kernel]      [k] __mod_zone_page_state
  0,43%  [kernel]      [k] page_add_new_anon_rmap
  0,38%  [kernel]      [k] free_pcppages_bulk

>
>> The only known issue that has this type of workaround that I know of are
>> transparent huge pages.
>
> I already disabled thp by:
> echo never > /sys/kernel/mm/transparent_hugepage/enabled
>
> cat /proc/meminfo says:
> HugePages_Total: 0
> HugePages_Free:  0
> HugePages_Rsvd:  0
> HugePages_Surp:  0
>
> Greets,
> Stefan
>
>>
>> cheers,
>> Julian
Re: btrfs and numa - needing drop_caches to keep speed up
Dear Julian,

On 14.10.2016 at 14:26, Julian Taylor wrote:
> On 10/14/2016 08:28 AM, Stefan Priebe - Profihost AG wrote:
>> Hello list,
>>
>> while running the same workload on two machines (single xeon and a dual
>> xeon) both with 64GB RAM.
>>
>> I need to run echo 3 >/proc/sys/vm/drop_caches every 15-30 minutes to
>> keep the speed as good as on the non numa system. I'm not sure whether
>> this is related to numa.
>>
>> Is there any sysctl parameter to tune?
>>
>> Tested with vanilla v4.8.1
>>
>> Greets,
>> Stefan
>
> hi,
> why do you think this is related to btrfs?

It was just an idea, as I couldn't find any other difference between those systems.

> This is easy to diagnose by recording some kernel stacks during the
> problem with perf.

Do you just mean perf top? Does it also show locking problems? I don't see much CPU usage in that case.

> The only known issue that has this type of workaround that I know of are
> transparent huge pages.

I already disabled THP with:
echo never > /sys/kernel/mm/transparent_hugepage/enabled

cat /proc/meminfo says:
HugePages_Total: 0
HugePages_Free:  0
HugePages_Rsvd:  0
HugePages_Surp:  0

Greets,
Stefan

>
> cheers,
> Julian
Re: [PATCH 2/2] btrfs: fix false enospc for compression
On 06.10.2016 at 04:51, Wang Xiaoguang wrote:
> When testing btrfs compression, sometimes we got ENOSPC error, though fs
> still has much free space, xfstests generic/171, generic/172, generic/173,
> generic/174, generic/175 can reveal this bug in my test environment when
> compression is enabled.
>
> After some debugging work, we found that it's btrfs_delalloc_reserve_metadata()
> which sometimes tries to reserve plenty of metadata space, even for very small
> data range. In btrfs_delalloc_reserve_metadata(), the number of metadata bytes
> we try to reserve is calculated by the difference between outstanding_extents
> and reserved_extents. Please see below case for how ENOSPC occurs:
>
> 1, Buffered write 128MB data in unit of 128KB, so finally we'll have inode
> outstanding extents be 1, and reserved_extents be 1024. Note it's
> btrfs_merge_extent_hook() that merges these 128KB units into one big
> outstanding extent, but do not change reserved_extents.
>
> 2, When writing dirty pages, for compression, cow_file_range_async() will
> split above big extent in unit of 128KB(compression extent size is 128KB).
> When first split operation finishes, we'll have 2 outstanding extents and 1024
> reserved extents, and just right now the currently generated ordered extent is
> dispatched to run and complete, then btrfs_delalloc_release_metadata()(see
> btrfs_finish_ordered_io()) will be called to release metadata, after that we
> will have 1 outstanding extents and 1 reserved extents(also see logic in
> drop_outstanding_extent()). Later cow_file_range_async() continues to handles
> left data range[128KB, 128MB), and if no other ordered extent was dispatched
> to run, there will be 1023 outstanding extents and 1 reserved extent.
>
> 3, Now if another buffered write for this file enters, then
> btrfs_delalloc_reserve_metadata() will at least try to reserve metadata
> for 1023 outstanding extents' metadata, for 16KB node size, it'll be
> 1023*16384*2*8, about 255MB, for 64K node size, it'll be 1023*65536*8*2,
> about 1GB metadata, so obviously it's not sane and can easily result in
> enospc error.
>
> The root cause is that for compression, its max extent size will no longer be
> BTRFS_MAX_EXTENT_SIZE(128MB), it'll be 128KB, so current metadata reservation
> method in btrfs is not appropriate or correct, here we introduce:
>	enum btrfs_metadata_reserve_type {
>		BTRFS_RESERVE_NORMAL,
>		BTRFS_RESERVE_COMPRESS,
>	};
> and expand btrfs_delalloc_reserve_metadata() and btrfs_delalloc_reserve_space()
> by adding a new enum btrfs_metadata_reserve_type argument. When a data range will
> go through compression, we use BTRFS_RESERVE_COMPRESS to reserve metadata.
> Meanwhile we introduce EXTENT_COMPRESS flag to mark a data range that will go
> through compression path.
>
> With this patch, we can fix these false enospc error for compression.
>
> Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com>

Tested-by: Stefan Priebe <s.pri...@profihost.ag>

Works fine since 8 days - no ENOSPC errors anymore.

Greets,
Stefan

> ---
>  fs/btrfs/ctree.h             |  31 ++--
>  fs/btrfs/extent-tree.c       |  55 +
>  fs/btrfs/extent_io.c         |  59 +-
>  fs/btrfs/extent_io.h         |   2 +
>  fs/btrfs/file.c              |  26 +--
>  fs/btrfs/free-space-cache.c  |   6 +-
>  fs/btrfs/inode-map.c         |   5 +-
>  fs/btrfs/inode.c             | 181 ---
>  fs/btrfs/ioctl.c             |  12 ++-
>  fs/btrfs/relocation.c        |  14 +++-
>  fs/btrfs/tests/inode-tests.c |  15 ++--
>  11 files changed, 309 insertions(+), 97 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 16885f6..fa6a19a 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -97,6 +97,19 @@ static const int btrfs_csum_sizes[] = { 4 };
>
>  #define BTRFS_DIRTY_METADATA_THRESH SZ_32M
>
> +/*
> + * for compression, max file extent size would be limited to 128K, so when
> + * reserving metadata for such delalloc writes, pass BTRFS_RESERVE_COMPRESS to
> + * btrfs_delalloc_reserve_metadata() or btrfs_delalloc_reserve_space() to
> + * calculate metadata, for none-compression, use BTRFS_RESERVE_NORMAL.
> + */
> +enum btrfs_metadata_reserve_type {
> +	BTRFS_RESERVE_NORMAL,
> +	BTRFS_RESERVE_COMPRESS,
> +};
> +int inode_need_compress(struct inode *inode);
> +u64 btrfs_max_extent_size(enum btrfs_metadata_reserve_type reserve_type);
> +
>  #define BTRFS_MAX_EXTENT_SIZE SZ_128M
>
>  struct btrfs_mapping_tree {
> @@ -2677,10 +2690,14 @@ int btrfs_subvolume_reserve_metadata(struct &g
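The reservation sizes named in step 3 of the patch description can be reproduced with simple arithmetic. The sketch below just multiplies the factors given in the message; reading the factor 8 as the maximum b-tree depth and the factor 2 as a safety margin is an assumption on my side, and metadata_reservation is a hypothetical helper, not a kernel function:

```python
BTRFS_MAX_LEVEL = 8  # assumption: the "8" in the message is the maximum tree depth


def metadata_reservation(num_extents: int, nodesize: int) -> int:
    """Bytes reserved for num_extents outstanding extents (factors from the message)."""
    return num_extents * nodesize * 2 * BTRFS_MAX_LEVEL


MIB = 1024 * 1024

for nodesize in (16384, 65536):
    reserved = metadata_reservation(1023, nodesize)
    print(f"nodesize {nodesize:>5}: {reserved} bytes = {reserved / MIB:.2f} MiB")
```

For a write that in the end only needs metadata for a single 128KB extent, a reservation of this size explains why ENOSPC can show up on a filesystem with plenty of free space.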
Re: [PATCH 1/2] btrfs: improve inode's outstanding_extents computation
On 06.10.2016 at 04:51, Wang Xiaoguang wrote:
> This issue was revealed by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB,
> When modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often
> gets these warnings from btrfs_destroy_inode():
>	WARN_ON(BTRFS_I(inode)->outstanding_extents);
>	WARN_ON(BTRFS_I(inode)->reserved_extents);
>
> Simple test program below can reproduce this issue steadily.
> Note: you need to modify BTRFS_MAX_EXTENT_SIZE to 64KB to have test,
> otherwise there won't be such WARNING.
>	#include <fcntl.h>
>	#include <string.h>
>	#include <sys/stat.h>
>	#include <sys/types.h>
>	#include <unistd.h>
>
>	int main(void)
>	{
>		int fd;
>		char buf[68 * 1024];
>
>		memset(buf, 0, 68 * 1024);
>		/* O_CREAT requires a mode argument */
>		fd = open("testfile", O_CREAT | O_EXCL | O_RDWR, 0644);
>		if (fd < 0)
>			return 1;
>		/* write 68KB at offset 64KB, crossing the 128KB boundary */
>		pwrite(fd, buf, 68 * 1024, 64 * 1024);
>		return 0;
>	}
>
> When BTRFS_MAX_EXTENT_SIZE is 64KB, and buffered data range is:
> 64KB            128KB   132KB
> |---------------|-------|
>        64KB + 4KB
>
> 1) for above data range, btrfs_delalloc_reserve_metadata() will reserve
> metadata and set BTRFS_I(inode)->outstanding_extents to 2.
> (68KB + 64KB - 1) / 64KB == 2
>
> Outstanding_extents: 2
>
> 2) then btrfs_dirty_pages() will be called to dirty pages and set
> EXTENT_DELALLOC flag. In this case, btrfs_set_bit_hook() will be called
> twice.
> The 1st set_bit_hook() call will set DELALLOC flag for the first 64K.
> 64KB            128KB
> |---------------|
>   64KB DELALLOC
> Outstanding_extents: 2
>
> Set_bit_hooks() uses FIRST_DELALLOC flag to avoid re-increase
> outstanding_extents counter.
> So for 1st set_bit_hooks() call, it won't modify outstanding_extents,
> it's still 2.
>
> Then FIRST_DELALLOC flag is *CLEARED*.
>
> 3) 2nd btrfs_set_bit_hook() call.
> Because FIRST_DELALLOC have been cleared by previous set_bit_hook(),
> btrfs_set_bit_hook() will increase BTRFS_I(inode)->outstanding_extents by
> one, so now BTRFS_I(inode)->outstanding_extents is 3.
> 64KB            128KB         132KB
> |---------------|-------------|
>   64K DELALLOC    4K DELALLOC
> Outstanding_extents: 3
>
> But the correct outstanding_extents number should be 2, not 3.
> The 2nd btrfs_set_bit_hook() call just screwed up this, and leads to the
> WARN_ON().
>
> Normally, we can solve it by only increasing outstanding_extents in
> set_bit_hook().
> But the problem is for delalloc_reserve/release_metadata(), we only have
> a 'length' parameter, and calculate in-accurate outstanding_extents.
> If we only rely on set_bit_hook() release_metadata() will screw things up
> as it will decrease inaccurate number.
>
> So the fix we use is:
> 1) Increase *INACCURATE* outstanding_extents at delalloc_reserve_meta
>    Just as a place holder.
> 2) Increase *accurate* outstanding_extents at set_bit_hooks()
>    This is the real increaser.
> 3) Decrease *INACCURATE* outstanding_extents before returning
>    This makes outstanding_extents to correct value.
>
> For 128M BTRFS_MAX_EXTENT_SIZE, due to limitation of
> __btrfs_buffered_write(), each iteration will only handle about 2MB
> data.
> So btrfs_dirty_pages() won't need to handle cases cross 2 extents.
>
> Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com>

Tested-by: Stefan Priebe <s.pri...@profihost.ag>

Works fine since 8 days - no ENOSPC errors anymore.

Greets,
Stefan

> ---
>  fs/btrfs/ctree.h |  2 ++
>  fs/btrfs/inode.c | 65 ++--
>  fs/btrfs/ioctl.c |  6 ++
>  3 files changed, 62 insertions(+), 11 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 33fe035..16885f6 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3119,6 +3119,8 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int delay_iput,
>  			       int nr);
>  int btrfs_set_extent_delalloc(struct inode *inode, u64 start, u64 end,
>  			      struct extent_state **cached_state);
> +int btrfs_set_extent_defrag(struct inode *inode, u64 start, u64 end,
> +			    struct extent_state **cached_state);
>  int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
>  			     struct btrfs_root *new_root,
>  			     struct btrfs_root *parent_r
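The miscount described in steps 1) to 3) of the patch description can be replayed with a toy model. This is a deliberately simplified sketch in Python, not kernel code: the function names are made up, and the only things modelled are the round-up at reservation time and the FIRST_DELALLOC flag that suppresses the increment exactly once:

```python
KIB = 1024


def reserved_extent_count(length: int, max_extent_size: int) -> int:
    """Round-up division used at reservation time: (len + max - 1) / max."""
    return (length + max_extent_size - 1) // max_extent_size


def simulate_old_accounting(length: int, max_extent_size: int) -> int:
    """Toy replay of the pre-patch counting described in the message."""
    # 1) the reservation sets the counter from the length alone
    outstanding = reserved_extent_count(length, max_extent_size)
    first_delalloc = True

    # 2)/3) one set_bit_hook() call per chunk of at most max_extent_size
    remaining = length
    while remaining > 0:
        chunk = min(remaining, max_extent_size)
        if first_delalloc:
            first_delalloc = False  # first call: no increment, flag is cleared
        else:
            outstanding += 1  # later calls: counted again
        remaining -= chunk
    return outstanding


print(reserved_extent_count(68 * KIB, 64 * KIB))    # expected by the reservation
print(simulate_old_accounting(68 * KIB, 64 * KIB))  # what the counter ends up as
```

With the 68KB write from the reproducer and a 64KB maximum extent size, the reservation expects 2 extents while the replayed counter ends at 3, which is the mismatch that triggers the WARN_ON().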
btrfs and numa - needing drop_caches to keep speed up
Hello list,

I'm running the same workload on two machines (a single Xeon and a dual Xeon), both with 64GB RAM.

On the dual Xeon (NUMA) machine I need to run echo 3 >/proc/sys/vm/drop_caches every 15-30 minutes to keep the speed as good as on the non-NUMA system. I'm not sure whether this is related to NUMA.

Is there any sysctl parameter to tune?

Tested with vanilla v4.8.1

Greets,
Stefan
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Dear Wang,

On 06.10.2016 at 05:04, Wang Xiaoguang wrote:
> Hi,
>
> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>> On 29.09.2016 at 09:13, Wang Xiaoguang wrote:
>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that as i do not have enough space to test this without
>>>> compression ;-( But yes i've compression enabled.
>>> I might not get you, my poor english :)
>>> You mean that you only get ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get ENOSPC error?
>> I can't tell you. I cannot test with compression not enabled. I do not
>> have enough free space on this disk.
> I had just sent two patches to fix false enospc error for compression,
> please have a try, they fix false enospc error in my test environment.
>   btrfs: fix false enospc for compression
>   btrfs: improve inode's outstanding_extents computation
>
> I apply these two patches in linux upstream tree, the latest commit
> is 41844e36206be90cd4d962ea49b0abc3612a99d0.

No ENOSPC errors for 5 days now! That's amazing so far. I hope it stays this way and your patches get into 4.9.

Greets,
Stefan

>
> Regards,
> Xiaoguang Wang
>
>>
>>>>> I'm trying to fix it.
>>>> That sounds good but do you also get the
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> kernel messages on umount? if not you might have found another problem.
>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>> [ cut here ] >> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 >> btrfs_free_block_groups+0x346/0x430 [btrfs]() >> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop >> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 >> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >> 880fda777d00 813b69c3 >> c067a099 880fda777d38 810821c6 >> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >> Call Trace: >> [] dump_stack+0x63/0x90 >> [] warn_slowpath_common+0x86/0xc0 >> [] warn_slowpath_null+0x1a/0x20 >> [] btrfs_free_block_groups+0x346/0x430 [btrfs] >> [] close_ctree+0x15d/0x330 [btrfs] >> [] btrfs_put_super+0x19/0x20 [btrfs] >> [] generic_shutdown_super+0x6f/0x100 >> [] kill_anon_super+0x12/0x20 >> [] btrfs_kill_super+0x16/0xa0 [btrfs] >> [] deactivate_locked_super+0x43/0x70 >> [] deactivate_super+0x5c/0x60 >> [] cleanup_mnt+0x3f/0x90 >> [] __cleanup_mnt+0x12/0x20 >> [] task_work_run+0x81/0xa0 >> [] exit_to_usermode_loop+0xb0/0xc0 >> [] syscall_return_slowpath+0xd4/0x130 >> [] int_ret_from_sys_call+0x25/0x8f >> ---[ end trace cee6ace13018e13e ]--- >> [ cut here ] >> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791 >> btrfs_free_block_groups+0x365/0x430 [btrfs]() >> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >> ip6_udp_tunnel udp_tunnel shpchp ipmi_si 
ipmi_msghandler button loop >> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1 >> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >> 880fda777d00 813b69c3 >> c067a099 880fda777d38 810821c6 >> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >> Call Trace: >> [] dump_stack+0x63/0x90 >> [] warn_slowpath_common+0x86/0xc0 &
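The huge "free" value in the subject line is easier to read when the unsigned 64-bit number is reinterpreted as a signed one. A small Python sketch (as_signed_64 is a made-up helper); that the counter actually went negative through an accounting underflow is a plausible reading, but it is an assumption that this thread does not confirm:

```python
def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit into 64 bits")
    return value - 2**64 if value >= 2**63 else value


reported = 18446742286429913088  # from the kernel message in the subject
signed = as_signed_64(reported)

print(signed)
print(f"{abs(signed) / 2**40:.2f} TiB below zero")
```

Read this way, the message reports roughly 1.6 TiB less than zero bytes free rather than about 16 EiB of free space.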
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
The main difference between the systems with and without OOMs is:
- Single Xeon => no OOM
- Dual Xeon / NUMA => OOM

Both have 64GB of memory.

On 07.10.2016 at 11:33, Holger Hoffstätte wrote:
> On 10/07/16 09:17, Wang Xiaoguang wrote:
>> Hi,
>>
>> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear Wang,
>>>
>>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>>
>>> Complete traces with your patch and some more btrfs patches applied (in
>>> the hope in fixes the OOM but it did not):
>>> http://pastebin.com/raw/6vmRSDm1
>> I didn't see any such OOMs...
>> Can you try holger's tree with my patches.
>
> They don't really apply to either 4.4.x (because it has diverged too
> much by now) or 4.8.x because of the initial dedupe support which came
> in as part of 4.9rc1 - there are way too many conflicts all over the
> place and merging them manually took way too much time.
> It would be useful if you could rebase your patches to for-next.
>
> Stefan, have you tried setting THP to 'madvise' or 'never'?
> Try 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled'
> or boot with transparent_hugepage=madvise (or never) kernel flag.
> I have no idea if it will help, but it's worth a try.
>
> -h
>
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Hi Wang,

it's currently running on the system and working fine, with no ENOSPC errors. But it will take a week to be sure they don't come back.

Thanks!

Greets,
Stefan

On 06.10.2016 at 05:04, Wang Xiaoguang wrote:
> Hi,
>
> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>> On 29.09.2016 at 09:13, Wang Xiaoguang wrote:
>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that as i do not have enough space to test this without
>>>> compression ;-( But yes i've compression enabled.
>>> I might not get you, my poor english :)
>>> You mean that you only get ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get ENOSPC error?
>> I can't tell you. I cannot test with compression not enabled. I do not
>> have enough free space on this disk.
> I had just sent two patches to fix false enospc error for compression,
> please have a try, they fix false enospc error in my test environment.
>   btrfs: fix false enospc for compression
>   btrfs: improve inode's outstanding_extents computation
>
> I apply these two patches in linux upstream tree, the latest commit
> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
>
> Regards,
> Xiaoguang Wang
>
>>
>>>>> I'm trying to fix it.
>>>> That sounds good but do you also get the
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> kernel messages on umount? if not you might have found another problem.
>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>> [ cut here ] >> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 >> btrfs_free_block_groups+0x346/0x430 [btrfs]() >> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop >> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 >> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >> 880fda777d00 813b69c3 >> c067a099 880fda777d38 810821c6 >> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >> Call Trace: >> [] dump_stack+0x63/0x90 >> [] warn_slowpath_common+0x86/0xc0 >> [] warn_slowpath_null+0x1a/0x20 >> [] btrfs_free_block_groups+0x346/0x430 [btrfs] >> [] close_ctree+0x15d/0x330 [btrfs] >> [] btrfs_put_super+0x19/0x20 [btrfs] >> [] generic_shutdown_super+0x6f/0x100 >> [] kill_anon_super+0x12/0x20 >> [] btrfs_kill_super+0x16/0xa0 [btrfs] >> [] deactivate_locked_super+0x43/0x70 >> [] deactivate_super+0x5c/0x60 >> [] cleanup_mnt+0x3f/0x90 >> [] __cleanup_mnt+0x12/0x20 >> [] task_work_run+0x81/0xa0 >> [] exit_to_usermode_loop+0xb0/0xc0 >> [] syscall_return_slowpath+0xd4/0x130 >> [] int_ret_from_sys_call+0x25/0x8f >> ---[ end trace cee6ace13018e13e ]--- >> [ cut here ] >> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791 >> btrfs_free_block_groups+0x365/0x430 [btrfs]() >> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >> ip6_udp_tunnel udp_tunnel shpchp ipmi_si 
ipmi_msghandler button loop >> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1 >> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >> 880fda777d00 813b69c3 >> c067a099 880fda777d38 810821c6 >> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >> Call Trace: >> [] dump_stack+0x63/0x90 >> [] warn_slowp
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Hi Holger,

On 07.10.2016 at 11:33, Holger Hoffstätte wrote:
> On 10/07/16 09:17, Wang Xiaoguang wrote:
>> Hi,
>>
>> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear Wang,
>>>
>>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>>
>>> Complete traces with your patch and some more btrfs patches applied (in
>>> the hope in fixes the OOM but it did not):
>>> http://pastebin.com/raw/6vmRSDm1
>> I didn't see any such OOMs...
>> Can you try holger's tree with my patches.
>
> They don't really apply to either 4.4.x (because it has diverged too
> much by now) or 4.8.x because of the initial dedupe support which came
> in as part of 4.9rc1 - there are way too many conflicts all over the
> place and merging them manually took way too much time.
> It would be useful if you could rebase your patches to for-next.
>
> Stefan, have you tried setting THP to 'madvise' or 'never'?
> Try 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled'
> or boot with transparent_hugepage=madvise (or never) kernel flag.
> I have no idea if it will help, but it's worth a try.

It's already set to never. The hosts are currently still up and running, but only if I run echo 3 >/proc/sys/vm/drop_caches every 30 minutes. It seems the kernel fails to reclaim the cache by itself when user space needs memory.

Greets,
Stefan

>
> -h
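For readers who want to check the two settings discussed in the message above, here is a minimal Python sketch. It assumes the usual sysfs format in which the active THP mode is shown in square brackets (for example `always madvise [never]`). The helper names `active_thp_mode` and `drop_caches` are made up for illustration and do not come from the thread.

```python
from pathlib import Path

# Paths taken from the message above.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")
DROP_CACHES = Path("/proc/sys/vm/drop_caches")


def active_thp_mode(text):
    """Return the active THP mode, i.e. the word shown in [brackets]."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def drop_caches(level=3):
    """Equivalent of 'echo 3 >/proc/sys/vm/drop_caches' (requires root).

    Hypothetical helper: it only mirrors the manual workaround from the
    thread and is not a fix for the underlying reclaim problem.
    """
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2 or 3")
    DROP_CACHES.write_text(f"{level}\n")


if __name__ == "__main__":
    # Only reads the current setting; nothing is modified here.
    if THP_ENABLED.exists():
        print("THP mode:", active_thp_mode(THP_ENABLED.read_text()))
```

Running `drop_caches()` from a timer every 30 minutes would reproduce the workaround described above, but it only hides the symptom and costs cache hit rate, so it is best seen as a stopgap.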
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Am 07.10.2016 um 10:07 schrieb Wang Xiaoguang: > hello, > > On 10/07/2016 04:06 PM, Stefan Priebe - Profihost AG wrote: >> and it shows: >> >> PAG | scan 33829e5 | steal 1968e3 | stall 0 | | >>| | swin 257071 | swout 346960 | >> >> but the highest user space prog uses only 190MB. > If you don't apply my patches, there will be no OOMs in your test > environment? > I want to confirm whether this OOM is caused by my patches... This happens also without your patches. That's what i meant with can't use v4.8.0. Is it OK to try v4.7.6? Greets, Stefan > > Regards, > Xiaoguang Wang > >> >> greets, >> Stefan >> >> Am 07.10.2016 um 09:17 schrieb Wang Xiaoguang: >>> Hi, >>> >>> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote: >>>> Dear Wang, >>>> >>>> can't use v4.8.0 as i always get OOMs and total machine crashes. >>>> >>>> Complete traces with your patch and some more btrfs patches applied (in >>>> the hope in fixes the OOM but it did not): >>>> http://pastebin.com/raw/6vmRSDm1 >>> I didn't see any such OOMs... >>> Can you try holger's tree with my patches. >>> >>> Regards, >>> Xiaoguang Wang >>>> Greets, >>>> Stefan >>>> Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang: >>>>> Hi, >>>>> >>>>> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote: >>>>>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang: >>>>>>>>> I found that compress sometime report ENOSPC error even in >>>>>>>>> 4.8-rc8, >>>>>>>>> currently >>>>>>>> I cannot confirm that as i do not have anough space to test this >>>>>>>> without >>>>>>>> compression ;-( But yes i've compression enabled. >>>>>>> I might not get you, my poor english :) >>>>>>> You mean that you only get ENOSPC error when compression is enabled? >>>>>>> >>>>>>> And when compression is not enabled, you do not get ENOSPC error? >>>>>> I can't tell you. I cannot test with compression not enabled. I do >>>>>> not >>>>>> have anough free space on this disk. 
>>>>> I had just sent two patches to fix false enospc error for compression, >>>>> please have a try, they fix false enospc error in my test environment. >>>>> btrfs: fix false enospc for compression >>>>> btrfs: improve inode's outstanding_extents computation >>>>> >>>>> I apply these two patchs in linux upstream tree, the latest commit >>>>> is 41844e36206be90cd4d962ea49b0abc3612a99d0. >>>>> >>>>> Regards, >>>>> Xiaoguang Wang >>>>> >>>>>>>>> I'm trying to fix it. >>>>>>>> That sounds good but do you also get the >>>>>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>>>>>>> >>>>>>>> kernel messages on umount? if not you might have found another >>>>>>>> problem. >>>>>>> Yes, I seem similar messages, you can paste you whole dmesg info >>>>>>> here. >>>>>> [ cut here ] >>>>>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 >>>>>> btrfs_free_block_groups+0x346/0x430 [btrfs]() >>>>>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >>>>>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp >>>>>> kvm_intel kvm >>>>>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >>>>>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop >>>>>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >>>>>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb >>>>>> i2c_algo_bit >>>>>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >>>>>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >>>>>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 >>>>>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >>>>>>
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
and it shows: PAG | scan 33829e5 | steal 1968e3 | stall 0 | | | | swin 257071 | swout 346960 | but the highest user space prog uses only 190MB. greets, Stefan Am 07.10.2016 um 09:17 schrieb Wang Xiaoguang: > Hi, > > On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote: >> Dear Wang, >> >> can't use v4.8.0 as i always get OOMs and total machine crashes. >> >> Complete traces with your patch and some more btrfs patches applied (in >> the hope in fixes the OOM but it did not): >> http://pastebin.com/raw/6vmRSDm1 > I didn't see any such OOMs... > Can you try holger's tree with my patches. > > Regards, > Xiaoguang Wang >> >> Greets, >> Stefan >> Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang: >>> Hi, >>> >>> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote: >>>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang: >>>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8, >>>>>>> currently >>>>>> I cannot confirm that as i do not have anough space to test this >>>>>> without >>>>>> compression ;-( But yes i've compression enabled. >>>>> I might not get you, my poor english :) >>>>> You mean that you only get ENOSPC error when compression is enabled? >>>>> >>>>> And when compression is not enabled, you do not get ENOSPC error? >>>> I can't tell you. I cannot test with compression not enabled. I do not >>>> have anough free space on this disk. >>> I had just sent two patches to fix false enospc error for compression, >>> please have a try, they fix false enospc error in my test environment. >>> btrfs: fix false enospc for compression >>> btrfs: improve inode's outstanding_extents computation >>> >>> I apply these two patchs in linux upstream tree, the latest commit >>> is 41844e36206be90cd4d962ea49b0abc3612a99d0. >>> >>> Regards, >>> Xiaoguang Wang >>> >>>>>>> I'm trying to fix it. >>>>>> That sounds good but do you also get the >>>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>>>>> >>>>>> kernel messages on umount? 
if not you might have found another >>>>>> problem. >>>>> Yes, I seem similar messages, you can paste you whole dmesg info here. >>>> [ cut here ] >>>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 >>>> btrfs_free_block_groups+0x346/0x430 [btrfs]() >>>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >>>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >>>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >>>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop >>>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >>>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >>>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >>>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >>>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 >>>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >>>> 880fda777d00 813b69c3 >>>> c067a099 880fda777d38 810821c6 >>>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >>>> Call Trace: >>>> [] dump_stack+0x63/0x90 >>>> [] warn_slowpath_common+0x86/0xc0 >>>> [] warn_slowpath_null+0x1a/0x20 >>>> [] btrfs_free_block_groups+0x346/0x430 [btrfs] >>>> [] close_ctree+0x15d/0x330 [btrfs] >>>> [] btrfs_put_super+0x19/0x20 [btrfs] >>>> [] generic_shutdown_super+0x6f/0x100 >>>> [] kill_anon_super+0x12/0x20 >>>> [] btrfs_kill_super+0x16/0xa0 [btrfs] >>>> [] deactivate_locked_super+0x43/0x70 >>>> [] deactivate_super+0x5c/0x60 >>>> [] cleanup_mnt+0x3f/0x90 >>>> [] __cleanup_mnt+0x12/0x20 >>>> [] task_work_run+0x81/0xa0 >>>> [] exit_to_usermode_loop+0xb0/0xc0 >>>> [] syscall_return_slowpath+0xd4/0x130 >>>> [] int_ret_from_sys_call+0x25/0x8f >>>> ---[ end trace cee6ace13018e13e ]--- >>>> [ cu
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
This is what atop shows for memory usage 5 minutes before the crash:

MEM | tot    62.8G | free  198.2M | cache  56.8G | buff    1.4M | slab    3.5G | shmem   1.1M | vmbal   0.0M | hptot   0.0M |
SWP | tot     3.7G | free    3.2G |              |              |              |              | vmcom   2.8G | vmlim  35.1G |

Greets,
Stefan

On 07.10.2016 at 09:17, Wang Xiaoguang wrote:
> Hi,
>
> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>> Dear Wang,
>>
>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>
>> Complete traces with your patch and some more btrfs patches applied (in
>> the hope in fixes the OOM but it did not):
>> http://pastebin.com/raw/6vmRSDm1
> I didn't see any such OOMs...
> Can you try holger's tree with my patches.
>
> Regards,
> Xiaoguang Wang
>>
>> Greets,
>> Stefan
>> On 06.10.2016 at 05:04, Wang Xiaoguang wrote:
>>> Hi,
>>>
>>> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>>>> On 29.09.2016 at 09:13, Wang Xiaoguang wrote:
>>>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>>>> currently
>>>>>> I cannot confirm that as i do not have anough space to test this
>>>>>> without
>>>>>> compression ;-( But yes i've compression enabled.
>>>>> I might not get you, my poor english :)
>>>>> You mean that you only get ENOSPC error when compression is enabled?
>>>>>
>>>>> And when compression is not enabled, you do not get ENOSPC error?
>>>> I can't tell you. I cannot test with compression not enabled. I do not
>>>> have anough free space on this disk.
>>> I had just sent two patches to fix false enospc error for compression,
>>> please have a try, they fix false enospc error in my test environment.
>>> btrfs: fix false enospc for compression
>>> btrfs: improve inode's outstanding_extents computation
>>>
>>> I apply these two patchs in linux upstream tree, the latest commit
>>> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
>>>
>>> Regards,
>>> Xiaoguang Wang
>>>
>>>>>>> I'm trying to fix it.
>>>>>> That sounds good but do you also get the >>>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>>>>> >>>>>> kernel messages on umount? if not you might have found another >>>>>> problem. >>>>> Yes, I seem similar messages, you can paste you whole dmesg info here. >>>> [ cut here ] >>>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 >>>> btrfs_free_block_groups+0x346/0x430 [btrfs]() >>>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >>>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >>>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >>>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop >>>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >>>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >>>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >>>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >>>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 >>>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >>>> 880fda777d00 813b69c3 >>>> c067a099 880fda777d38 810821c6 >>>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >>>> Call Trace: >>>> [] dump_stack+0x63/0x90 >>>> [] warn_slowpath_common+0x86/0xc0 >>>> [] warn_slowpath_null+0x1a/0x20 >>>> [] btrfs_free_block_groups+0x346/0x430 [btrfs] >>>> [] close_ctree+0x15d/0x330 [btrfs] >>>> [] btrfs_put_super+0x19/0x20 [btrfs] >>>> [] generic_shutdown_super+0x6f/0x100 >>>> [] kill_anon_super+0x12/0x20 >>>> [] btrfs_kill_super+0x16/0xa0 [btrfs] >>>> [] deactivate_locked_super+0x43/0x70 >>>> [] deactivate_super+0x5c/0x60 >>>> [] cleanup_mnt+0x3f/0x90 >>>> [] __cleanup_mnt+0x12/0x20 >>>> [] task_work_run+0x81/0xa0 >>>> [] exit_to_usermode_loop+0xb0/0xc0 >>>> [] syscall_return_slowpath+0xd4/0x130 >>>> []
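To put the atop figures from the message above into proportion, the following Python sketch computes how much of the RAM was held by page cache and slab shortly before the crash. The four values are copied from the MEM line. The unit handling (binary multiples for K/M/G) is an assumption about atop's output, and the helper name `parse_size` is invented for this example.

```python
# Binary multiples are assumed for atop's K/M/G/T suffixes.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a string such as '56.8G' into a number of bytes."""
    text = text.strip()
    return float(text[:-1]) * UNITS[text[-1]]


# Values from the atop MEM line quoted in the message.
mem = {
    name: parse_size(value)
    for name, value in {
        "tot": "62.8G",
        "free": "198.2M",
        "cache": "56.8G",
        "slab": "3.5G",
    }.items()
}

cache_share = mem["cache"] / mem["tot"]
cache_and_slab_share = (mem["cache"] + mem["slab"]) / mem["tot"]

print(f"page cache:         {cache_share:.1%} of RAM")
print(f"page cache + slab:  {cache_and_slab_share:.1%} of RAM")
print(f"free:               {mem['free'] / mem['tot']:.2%} of RAM")
```

Under these assumptions about nine tenths of the memory sits in the page cache, which matches the observation that the largest user-space program uses only 190MB while the machine still runs out of memory.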
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
On 07.10.2016 at 09:17, Wang Xiaoguang wrote:
> Hi,
>
> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>> Dear Wang,
>>
>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>
>> Complete traces with your patch and some more btrfs patches applied (in
>> the hope in fixes the OOM but it did not):
>> http://pastebin.com/raw/6vmRSDm1
> I didn't see any such OOMs...
> Can you try holger's tree with my patches.

Dear Wang, I already tried that. It doesn't help. It also happens on only two out of three servers. After some time it starts killing processes that use little memory, but I have no idea where all that memory is consumed (the machines have 64 GB).

Greets,
Stefan

> Regards,
> Xiaoguang Wang
>>
>> Greets,
>> Stefan
>> On 06.10.2016 at 05:04, Wang Xiaoguang wrote:
>>> Hi,
>>>
>>> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>>>> On 29.09.2016 at 09:13, Wang Xiaoguang wrote:
>>>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>>>> currently
>>>>>> I cannot confirm that as i do not have anough space to test this
>>>>>> without
>>>>>> compression ;-( But yes i've compression enabled.
>>>>> I might not get you, my poor english :)
>>>>> You mean that you only get ENOSPC error when compression is enabled?
>>>>>
>>>>> And when compression is not enabled, you do not get ENOSPC error?
>>>> I can't tell you. I cannot test with compression not enabled. I do not
>>>> have anough free space on this disk.
>>> I had just sent two patches to fix false enospc error for compression,
>>> please have a try, they fix false enospc error in my test environment.
>>> btrfs: fix false enospc for compression
>>> btrfs: improve inode's outstanding_extents computation
>>>
>>> I apply these two patchs in linux upstream tree, the latest commit
>>> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
>>>
>>> Regards,
>>> Xiaoguang Wang
>>>
>>>>>>> I'm trying to fix it.
>>>>>> That sounds good but do you also get the >>>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>>>>> >>>>>> kernel messages on umount? if not you might have found another >>>>>> problem. >>>>> Yes, I seem similar messages, you can paste you whole dmesg info here. >>>> [ cut here ] >>>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 >>>> btrfs_free_block_groups+0x346/0x430 [btrfs]() >>>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >>>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >>>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >>>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop >>>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >>>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >>>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >>>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >>>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 >>>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >>>> 880fda777d00 813b69c3 >>>> c067a099 880fda777d38 810821c6 >>>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >>>> Call Trace: >>>> [] dump_stack+0x63/0x90 >>>> [] warn_slowpath_common+0x86/0xc0 >>>> [] warn_slowpath_null+0x1a/0x20 >>>> [] btrfs_free_block_groups+0x346/0x430 [btrfs] >>>> [] close_ctree+0x15d/0x330 [btrfs] >>>> [] btrfs_put_super+0x19/0x20 [btrfs] >>>> [] generic_shutdown_super+0x6f/0x100 >>>> [] kill_anon_super+0x12/0x20 >>>> [] btrfs_kill_super+0x16/0xa0 [btrfs] >>>> [] deactivate_locked_super+0x43/0x70 >>>> [] deactivate_super+0x5c/0x60 >>>> [] cleanup_mnt+0x3f/0x90 >>>> [] __cleanup_mnt+0x12/0x20 >>>> [] task_work_run+0x81/0xa0 >>>> [] exit_to_usermode_loop+0xb0/0xc0 >>>> [] syscall_return_slowpath+0xd4/0x130 >>>> [] int_ret_from_sys_call+0x25/0x8f >>>> ---[ end trace cee6ace13018e13e ]--- >>>> [ cut h
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Dear Wang, can't use v4.8.0 as i always get OOMs and total machine crashes. Complete traces with your patch and some more btrfs patches applied (in the hope in fixes the OOM but it did not): http://pastebin.com/raw/6vmRSDm1 Greets, Stefan Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang: > Hi, > > On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote: >> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang: >>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8, >>>>> currently >>>> I cannot confirm that as i do not have anough space to test this >>>> without >>>> compression ;-( But yes i've compression enabled. >>> I might not get you, my poor english :) >>> You mean that you only get ENOSPC error when compression is enabled? >>> >>> And when compression is not enabled, you do not get ENOSPC error? >> I can't tell you. I cannot test with compression not enabled. I do not >> have anough free space on this disk. > I had just sent two patches to fix false enospc error for compression, > please have a try, they fix false enospc error in my test environment. > btrfs: fix false enospc for compression > btrfs: improve inode's outstanding_extents computation > > I apply these two patchs in linux upstream tree, the latest commit > is 41844e36206be90cd4d962ea49b0abc3612a99d0. > > Regards, > Xiaoguang Wang > >> >>>>> I'm trying to fix it. >>>> That sounds good but do you also get the >>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>>> >>>> kernel messages on umount? if not you might have found another problem. >>> Yes, I seem similar messages, you can paste you whole dmesg info here. 
>> [ cut here ] >> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 >> btrfs_free_block_groups+0x346/0x430 [btrfs]() >> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop >> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 >> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >> 880fda777d00 813b69c3 >> c067a099 880fda777d38 810821c6 >> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >> Call Trace: >> [] dump_stack+0x63/0x90 >> [] warn_slowpath_common+0x86/0xc0 >> [] warn_slowpath_null+0x1a/0x20 >> [] btrfs_free_block_groups+0x346/0x430 [btrfs] >> [] close_ctree+0x15d/0x330 [btrfs] >> [] btrfs_put_super+0x19/0x20 [btrfs] >> [] generic_shutdown_super+0x6f/0x100 >> [] kill_anon_super+0x12/0x20 >> [] btrfs_kill_super+0x16/0xa0 [btrfs] >> [] deactivate_locked_super+0x43/0x70 >> [] deactivate_super+0x5c/0x60 >> [] cleanup_mnt+0x3f/0x90 >> [] __cleanup_mnt+0x12/0x20 >> [] task_work_run+0x81/0xa0 >> [] exit_to_usermode_loop+0xb0/0xc0 >> [] syscall_return_slowpath+0xd4/0x130 >> [] int_ret_from_sys_call+0x25/0x8f >> ---[ end trace cee6ace13018e13e ]--- >> [ cut here ] >> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791 >> btrfs_free_block_groups+0x365/0x430 [btrfs]() >> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >> ip6_udp_tunnel udp_tunnel shpchp ipmi_si 
ipmi_msghandler button loop >> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1 >> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >> 880fda777d00 813b69c3 >> c067a099 880fda777d38 810821c6 >> 880074bf0a00 88103c10c088 fff
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Thanks Wang, i applied them both on top of vanilla v4.8 - i hope this is OK. Will report back what happens. Greets, Stefan Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang: > Hi, > > On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote: >> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang: >>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8, >>>>> currently >>>> I cannot confirm that as i do not have anough space to test this >>>> without >>>> compression ;-( But yes i've compression enabled. >>> I might not get you, my poor english :) >>> You mean that you only get ENOSPC error when compression is enabled? >>> >>> And when compression is not enabled, you do not get ENOSPC error? >> I can't tell you. I cannot test with compression not enabled. I do not >> have anough free space on this disk. > I had just sent two patches to fix false enospc error for compression, > please have a try, they fix false enospc error in my test environment. > btrfs: fix false enospc for compression > btrfs: improve inode's outstanding_extents computation > > I apply these two patchs in linux upstream tree, the latest commit > is 41844e36206be90cd4d962ea49b0abc3612a99d0. > > Regards, > Xiaoguang Wang > >> >>>>> I'm trying to fix it. >>>> That sounds good but do you also get the >>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>>> >>>> kernel messages on umount? if not you might have found another problem. >>> Yes, I seem similar messages, you can paste you whole dmesg info here. 
>> [ cut here ] >> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 >> btrfs_free_block_groups+0x346/0x430 [btrfs]() >> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop >> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 >> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >> 880fda777d00 813b69c3 >> c067a099 880fda777d38 810821c6 >> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >> Call Trace: >> [] dump_stack+0x63/0x90 >> [] warn_slowpath_common+0x86/0xc0 >> [] warn_slowpath_null+0x1a/0x20 >> [] btrfs_free_block_groups+0x346/0x430 [btrfs] >> [] close_ctree+0x15d/0x330 [btrfs] >> [] btrfs_put_super+0x19/0x20 [btrfs] >> [] generic_shutdown_super+0x6f/0x100 >> [] kill_anon_super+0x12/0x20 >> [] btrfs_kill_super+0x16/0xa0 [btrfs] >> [] deactivate_locked_super+0x43/0x70 >> [] deactivate_super+0x5c/0x60 >> [] cleanup_mnt+0x3f/0x90 >> [] __cleanup_mnt+0x12/0x20 >> [] task_work_run+0x81/0xa0 >> [] exit_to_usermode_loop+0xb0/0xc0 >> [] syscall_return_slowpath+0xd4/0x130 >> [] int_ret_from_sys_call+0x25/0x8f >> ---[ end trace cee6ace13018e13e ]--- >> [ cut here ] >> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791 >> btrfs_free_block_groups+0x365/0x430 [btrfs]() >> Modules linked in: netconsole xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm >> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan >> ip6_udp_tunnel udp_tunnel shpchp ipmi_si 
ipmi_msghandler button loop >> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit >> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd >> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid >> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1 >> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 >> 880fda777d00 813b69c3 >> c067a099 880fda777d38 810821c6 >> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 >> Call Trace: >> [] dump_stack+0x63/0x90 >> [] warn_slowpath_common+0x86/0xc0 >>
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Hi,

On 29.09.2016 at 12:03, Adam Borowski wrote:
> On Thu, Sep 29, 2016 at 09:27:01AM +0200, Stefan Priebe - Profihost AG wrote:
>> On 29.09.2016 at 09:13, Wang Xiaoguang wrote:
>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that as i do not have anough space to test this without
>>>> compression ;-( But yes i've compression enabled.
>>> I might not get you, my poor english :)
>>> You mean that you only get ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get ENOSPC error?
>>
>> I can't tell you. I cannot test with compression not enabled. I do not
>> have anough free space on this disk.
>
> Disabling compression doesn't immediately require any space -- it affects
> only newly written data. What you already have remains in the old
> compression setting, unless you defrag everything (a side effect of
> defragging is switching existing extents to the new compression mode).

Yes, I know that, but most of the workload creates reflinks to old files and modifies data in them. So to create a good test I would need to defrag and uncompress all those files.

Greets,
Stefan
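A side note on the number in this thread's subject line: 18446742286429913088 is larger than 2^63, so it looks like a negative value printed as an unsigned 64-bit integer. The Python sketch below reinterprets it as a signed two's-complement number. That this reflects a space-accounting underflow is only an interpretation of the number and is not confirmed anywhere in the thread. The function name `as_signed_64` is made up for this example.

```python
def as_signed_64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit into 64 bits")
    return value - 2**64 if value >= 2**63 else value


# Value from the kernel message
# "BTRFS: space_info 4 has 18446742286429913088 free, is not full".
reported = 18446742286429913088

signed = as_signed_64(reported)
print(signed)                                 # -1787279638528
print(f"{signed / 2**40:.2f} TiB")            # roughly -1.63 TiB
```

Read this way, the counter would be about 1.6 TiB below zero rather than holding 16 EiB of free space.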
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang: >>> I found that compress sometime report ENOSPC error even in 4.8-rc8, >>> currently >> I cannot confirm that as i do not have anough space to test this without >> compression ;-( But yes i've compression enabled. > I might not get you, my poor english :) > You mean that you only get ENOSPC error when compression is enabled? > > And when compression is not enabled, you do not get ENOSPC error? I can't tell you. I cannot test with compression not enabled. I do not have anough free space on this disk. >>> I'm trying to fix it. >> That sounds good but do you also get the >> BTRFS: space_info 4 has 18446742286429913088 free, is not full >> >> kernel messages on umount? if not you might have found another problem. > Yes, I seem similar messages, you can paste you whole dmesg info here. [ cut here ] WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790 btrfs_free_block_groups+0x346/0x430 [btrfs]() Modules linked in: netconsole xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1 Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 880fda777d00 813b69c3 c067a099 880fda777d38 810821c6 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 Call Trace: [] dump_stack+0x63/0x90 [] warn_slowpath_common+0x86/0xc0 [] warn_slowpath_null+0x1a/0x20 [] btrfs_free_block_groups+0x346/0x430 [btrfs] [] close_ctree+0x15d/0x330 [btrfs] [] btrfs_put_super+0x19/0x20 [btrfs] [] generic_shutdown_super+0x6f/0x100 [] 
kill_anon_super+0x12/0x20 [] btrfs_kill_super+0x16/0xa0 [btrfs] [] deactivate_locked_super+0x43/0x70 [] deactivate_super+0x5c/0x60 [] cleanup_mnt+0x3f/0x90 [] __cleanup_mnt+0x12/0x20 [] task_work_run+0x81/0xa0 [] exit_to_usermode_loop+0xb0/0xc0 [] syscall_return_slowpath+0xd4/0x130 [] int_ret_from_sys_call+0x25/0x8f ---[ end trace cee6ace13018e13e ]--- [ cut here ] WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791 btrfs_free_block_groups+0x365/0x430 [btrfs]() Modules linked in: netconsole xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1 Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 880fda777d00 813b69c3 c067a099 880fda777d38 810821c6 880074bf0a00 88103c10c088 88103c10c000 88103c10c098 Call Trace: [] dump_stack+0x63/0x90 [] warn_slowpath_common+0x86/0xc0 [] warn_slowpath_null+0x1a/0x20 [] btrfs_free_block_groups+0x365/0x430 [btrfs] [] close_ctree+0x15d/0x330 [btrfs] [] btrfs_put_super+0x19/0x20 [btrfs] [] generic_shutdown_super+0x6f/0x100 [] kill_anon_super+0x12/0x20 [] btrfs_kill_super+0x16/0xa0 [btrfs] [] deactivate_locked_super+0x43/0x70 [] deactivate_super+0x5c/0x60 [] cleanup_mnt+0x3f/0x90 [] __cleanup_mnt+0x12/0x20 [] task_work_run+0x81/0xa0 [] exit_to_usermode_loop+0xb0/0xc0 [] syscall_return_slowpath+0xd4/0x130 [] int_ret_from_sys_call+0x25/0x8f ---[ end trace cee6ace13018e13f ]--- [ cut here ] WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:10151 btrfs_free_block_groups+0x291/0x430 [btrfs]() Modules linked in: 
netconsole xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1 Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 880fda777d00 813b69c3
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
On 29.09.2016 at 08:55, Wang Xiaoguang wrote:
> Hi,
>
> On 09/29/2016 02:49 PM, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> On 28.09.2016 at 14:10, Wang Xiaoguang wrote:
>>> OK, I see.
>>> But given that you often run into enospc errors, can you work out a
>>> reproduce
>>> script according to you work load. That will give us great help.
> You got ENOSPC errors only when you have compress enabled?
>
> I found that compress sometime report ENOSPC error even in 4.8-rc8,
> currently

I cannot confirm that, as I do not have enough space to test this without compression ;-( But yes, I have compression enabled.

> I'm trying to fix it.

That sounds good, but do you also get the

BTRFS: space_info 4 has 18446742286429913088 free, is not full

kernel messages on umount? If not, you might have found another problem.

Stefan

> Regards,
> Xiaoguang Wang
>
>> I tried hard to reproduce it but i can't get it to reproduce with a test
>> script. Any ideas?
>>
>> Stefan
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Hi,

On 28.09.2016 at 14:10, Wang Xiaoguang wrote:
> OK, I see.
> But given that you often run into enospc errors, can you work out a
> reproduce
> script according to you work load. That will give us great help.

I tried hard to reproduce it, but I can't get it to trigger with a test script. Any ideas?

Stefan

> Regards,
> Xiaoguang Wang
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
On 28.09.2016 at 15:44, Holger Hoffstätte wrote:
>> Good idea but it does not. I hope i can reproduce this with my already
>> existing testscript which i've now bumped to use a 37TB partition and
>> big files rather than a 15GB part and small files. If i can reproduce it
>> i can also check whether disabling compression fixes this.
>
> Great. Remember to undo the compression on existing files, or create
> them from scratch.

I create the files from scratch, but currently I can't trigger the problem with my test script. Even under production load it's not that easy: I need to process 60-120 files before the error is triggered.

>> No that's not the case. No rsync nor inplace is involved. I'm dumping
>> differences directly from ceph and put them on top of a base image but
>> only for 7 days. So it's not endless fragmenting the file. After 7 days
>> a clean whole image is dumped.
>
> That sounds sane but it's also not at all how you described things to me
> previously ;) But OK.

I'm sorry. Maybe my English is just bad, you got me wrong, or I was drunk *joke*. The setup never changed.

> How do you "dump differences directly from
> Ceph"? I'd assume the VM images are RBDs, but it sounds you're somehow
> using overlayfs.

You can use rbd diff to export the differences between two snapshots. So no overlayfs is involved.

> Anyway..something is off and you successfully cause it while other
> people apparently do not.

Sure, I know that. But I still don't want to switch to ZFS.

> Do you still use those nonstandard mount
> options with extremely long transaction flush times?

No, I removed commit=300 just to be sure it does not cause this issue.

Stefan
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Dear Holger, first thanks for your long e-mail. Am 28.09.2016 um 14:47 schrieb Holger Hoffstätte: > On 09/28/16 13:35, Wang Xiaoguang wrote: >> hello, >> >> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote: >>> Dear list, >>> >>> is there any chance anybody wants to work with me on the following issue? >> Though I'm also somewhat new to btrfs, but I'd like to. >> >>> >>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0, >>> reserved=0, may_use=1808490201088, readonly=0 >>> >>> i get this nearly every day. >>> >>> Here are some msg collected from today and yesterday from different servers: >>> | BTRFS: space_info 4 has 18446742182612910080 free, is not full | >>> | BTRFS: space_info 4 has 18446742254739439616 free, is not full | >>> | BTRFS: space_info 4 has 18446743980225085440 free, is not full | >>> | BTRFS: space_info 4 has 18446743619906420736 free, is not full | >>> | BTRFS: space_info 4 has 18446743647369576448 free, is not full | >>> | BTRFS: space_info 4 has 18446742286429913088 free, is not full >>> >>> What i tried so far without success: >>> - use vanilla 4.8-rc8 kernel >>> - use latest vanilla 4.4 kernel >>> - use latest 4.4 kernel + patches from holger hoffstaette > > Was that 4.4.22? It contains a patch by Goldwyn Rodrigues called > "Prevent qgroup->reserved from going subzero" which should prevent > this from happening. This should only affect filesystems with enabled > quota; you said you didn't have quota enabled, yet some quota-only > patches caused problems on your system (despite being scheduled for > 4.9 and apparently working fine everywhere else, even when I > specifically tested them *with* quota enabled). Yes this is 4.4.22 and no i don't have qgroups enabled so it can't help. 
# btrfs qgroup show /path/ ERROR: can't perform the search - No such file or directory ERROR: can't list qgroups: No such file or director This is the same output on all backup machines. > It means either: > - you tried my patchset for 4.4.21 (i.e. *without* the above patch) > and should bump to .22 right away No it's 4.4.22 > - you _do_ have qgroups enabled for some reason (systemd?) No see above - but yes i use systemd. > - your fs is corrupted and needs nuking If this is the case all FS on 5 servers must be corrupted and all of them were installed at a different date / year. The newest one just 5 month ago with kernel 4.1 the others with 3.18. Also a lot of other systems with just 100-900GB of space are working fine. > - you did something else entirely No idea what this could be. > There is also the chance that your use of compress-force (or rather > compression in general) causes leakage; compression runs asynchronously > and I wouldn't be surprised if that is still full of racy races..which > would be unfortunate, but you could try to disable compression for a > while and see what happens, assuming the space requirements allow this > experiment. Good idea but it does not. I hope i can reproduce this with my already existing testscript which i've now bumped to use a 37TB partition and big files rather than a 15GB part and small files. If i can reproduce it i can also check whether disabling compression fixes this. What speaks against this is that i've also a MariaDB Server which runs fine since two years with compress-force but uses only < 100GB files and also does not create and remove them on a daily basis. > You have also not told us whether this happens only on one (potentially > corrupted/confused) fs or on every one - my impression was that you have > several sharded backup filesystems/machines; not sure if that is still > the case. If it happens only on one specific fs chances are it's hosed. It happens on all of them - sorry if i missed this. 
>> I also met enospc error in 4.8-rc6 when doing big files create and delete >> tests, >> for my cases, I have written some patches to fix it. >> Would you please apply my patches to have a try: >> btrfs: try to satisfy metadata requests when every flush_space() returns >> btrfs: try to write enough delalloc bytes when reclaiming metadata space >> btrfs: make shrink_delalloc() try harder to reclaim metadata space > > These are all in my series for 4.4.22 and seem to work fine, however > Stefan's workload has nothing directly to do with big files; instead > it's the worst case scenario in terms of fragmentation (of huge files) and > a huge number of extents: incremental backups of VMs via rsync --inplace > with forced compression. No that's not the case. No rsync n
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Am 28.09.2016 um 14:10 schrieb Wang Xiaoguang: > hello, > > On 09/28/2016 08:02 PM, Stefan Priebe - Profihost AG wrote: >> Hi Xiaoguang Wang, >> >> Am 28.09.2016 um 13:35 schrieb Wang Xiaoguang: >>> hello, >>> >>> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote: >>>> Dear list, >>>> >>>> is there any chance anybody wants to work with me on the following >>>> issue? >>> Though I'm also somewhat new to btrfs, but I'd like to. >>> >>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full >>>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0, >>>> reserved=0, may_use=1808490201088, readonly=0 >>>> >>>> i get this nearly every day. >>>> >>>> Here are some msg collected from today and yesterday from different >>>> servers: >>>> | BTRFS: space_info 4 has 18446742182612910080 free, is not full | >>>> | BTRFS: space_info 4 has 18446742254739439616 free, is not full | >>>> | BTRFS: space_info 4 has 18446743980225085440 free, is not full | >>>> | BTRFS: space_info 4 has 18446743619906420736 free, is not full | >>>> | BTRFS: space_info 4 has 18446743647369576448 free, is not full | >>>> | BTRFS: space_info 4 has 18446742286429913088 free, is not full >>>> >>>> What i tried so far without success: >>>> - use vanilla 4.8-rc8 kernel >>>> - use latest vanilla 4.4 kernel >>>> - use latest 4.4 kernel + patches from holger hoffstaette >>>> - use clear_cache,space_cache=v2 >>>> - use clear_cache,space_cache=v1 >>>> >>>> But all tries result in ENOSPC after a short period of time doing >>>> backups. >>> I also met enospc error in 4.8-rc6 when doing big files create and >>> delete tests, >>> for my cases, I have written some patches to fix it. 
>>> Would you please apply my patches to have a try: >>> btrfs: try to satisfy metadata requests when every flush_space() returns >>> btrfs: try to write enough delalloc bytes when reclaiming metadata space >>> btrfs: make shrink_delalloc() try harder to reclaim metadata space >>> You can find them in btrfs mail list. >> those are already in the patchset from holger: >> >> So i have these in my testing patchset (latest 4.4 kernel + patches from >> holger hoffstaette): >> >> btrfs-20160921-try-to-satisfy-metadata-requests-when-every-flush_space()-returns.patch >> >> >> btrfs-20160921-try-to-write-enough-delalloc-bytes-when-reclaiming-metadata-space.patch >> >> >> btrfs-20160922-make-shrink_delalloc()-try-harder-to-reclaim-metadata-space.patch >> > OK, I see. > But given that you often run into enospc errors, can you work out a > reproduce > script according to you work load. That will give us great help. I already tried that but it wasn't working. It seems i need a test device with +20TB and i need creating file that big in the tests. But that isn't easy. Currently i've no test hardware that big. May be i should try that on a production server. Stefan > Reagrds, > Xiaoguang Wang > >> >> Greets, >> Stefan >> >>> Regards, >>> Xiaoguang Wang >>>> Greets, >>>> Stefan >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe >>>> linux-btrfs" in >>>> the body of a message to majord...@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>>> >>> >>> >> > > > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Hi Xiaoguang Wang, Am 28.09.2016 um 13:35 schrieb Wang Xiaoguang: > hello, > > On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote: >> Dear list, >> >> is there any chance anybody wants to work with me on the following issue? > Though I'm also somewhat new to btrfs, but I'd like to. > >> >> BTRFS: space_info 4 has 18446742286429913088 free, is not full >> BTRFS: space_info total=98247376896, used=77036814336, pinned=0, >> reserved=0, may_use=1808490201088, readonly=0 >> >> i get this nearly every day. >> >> Here are some msg collected from today and yesterday from different >> servers: >> | BTRFS: space_info 4 has 18446742182612910080 free, is not full | >> | BTRFS: space_info 4 has 18446742254739439616 free, is not full | >> | BTRFS: space_info 4 has 18446743980225085440 free, is not full | >> | BTRFS: space_info 4 has 18446743619906420736 free, is not full | >> | BTRFS: space_info 4 has 18446743647369576448 free, is not full | >> | BTRFS: space_info 4 has 18446742286429913088 free, is not full >> >> What i tried so far without success: >> - use vanilla 4.8-rc8 kernel >> - use latest vanilla 4.4 kernel >> - use latest 4.4 kernel + patches from holger hoffstaette >> - use clear_cache,space_cache=v2 >> - use clear_cache,space_cache=v1 >> >> But all tries result in ENOSPC after a short period of time doing >> backups. > I also met enospc error in 4.8-rc6 when doing big files create and > delete tests, > for my cases, I have written some patches to fix it. > Would you please apply my patches to have a try: > btrfs: try to satisfy metadata requests when every flush_space() returns > btrfs: try to write enough delalloc bytes when reclaiming metadata space > btrfs: make shrink_delalloc() try harder to reclaim metadata space > You can find them in btrfs mail list. 
those are already in the patchset from holger: So i have these in my testing patchset (latest 4.4 kernel + patches from holger hoffstaette): btrfs-20160921-try-to-satisfy-metadata-requests-when-every-flush_space()-returns.patch btrfs-20160921-try-to-write-enough-delalloc-bytes-when-reclaiming-metadata-space.patch btrfs-20160922-make-shrink_delalloc()-try-harder-to-reclaim-metadata-space.patch Greets, Stefan > > Regards, > Xiaoguang Wang >> >> Greets, >> Stefan >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> > > > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
BTRFS: space_info 4 has 18446742286429913088 free, is not full
Dear list,

is there any chance anybody wants to work with me on the following issue?

BTRFS: space_info 4 has 18446742286429913088 free, is not full
BTRFS: space_info total=98247376896, used=77036814336, pinned=0, reserved=0, may_use=1808490201088, readonly=0

I get this nearly every day.

Here are some messages collected today and yesterday from different servers:
| BTRFS: space_info 4 has 18446742182612910080 free, is not full |
| BTRFS: space_info 4 has 18446742254739439616 free, is not full |
| BTRFS: space_info 4 has 18446743980225085440 free, is not full |
| BTRFS: space_info 4 has 18446743619906420736 free, is not full |
| BTRFS: space_info 4 has 18446743647369576448 free, is not full |
| BTRFS: space_info 4 has 18446742286429913088 free, is not full

What I have tried so far without success:
- use vanilla 4.8-rc8 kernel
- use latest vanilla 4.4 kernel
- use latest 4.4 kernel + patches from holger hoffstaette
- use clear_cache,space_cache=v2
- use clear_cache,space_cache=v1

But all attempts result in ENOSPC after a short period of doing backups.

Greets,
Stefan
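The "free" values in these messages sit just below 2^64, which is what a negative number looks like when it is printed as an unsigned 64-bit integer. The following is a minimal sketch of that arithmetic, assuming "free" is total minus all the other counters in the second log line; that formula is inferred from the two log lines in this thread, not taken from the kernel source, and the helper names are made up for illustration.

```python
U64 = 1 << 64  # unsigned 64-bit arithmetic wraps modulo 2**64


def free_as_u64(total, used, pinned, reserved, may_use, readonly):
    """Assumed formula: subtract every counter from total, wrapping like a u64."""
    return (total - used - pinned - reserved - may_use - readonly) % U64


def as_signed(value):
    """Reinterpret a u64 value as a signed 64-bit integer."""
    return value - U64 if value >= (1 << 63) else value


# Numbers copied from the two log lines in the post above.
reported = 18446742286429913088
computed = free_as_u64(
    total=98247376896,
    used=77036814336,
    pinned=0,
    reserved=0,
    may_use=1808490201088,
    readonly=0,
)

print(computed == reported)  # True
print(as_signed(reported))   # -1787279638528
```

Read this way, the message reports a deficit of about 1.79 TB rather than free space: may_use (about 1.8 TB) is far larger than total (about 98 GB). The flag value 4 in "space_info 4" corresponds to the metadata block group type, so this would point at metadata reservations that were never released.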
Re: deadlock with btrfs heavy i/o and kswapd
Hi Chris,

today I had this again, but I can't see any stack traces. I just see:

INFO: kworker/u128:5:24301 blocked for more than 120 seconds.
...
INFO: kworker/u128:5:24301 blocked for more than 120 seconds.
...
INFO: task mysqld:929 blocked for more ...
...

sysrq-w just prints:

sysrq: SysRq: Show Blocked State

but nothing more.

Stefan

On 22.09.2016 at 16:28, Chris Mason wrote:
> On 09/22/2016 02:41 AM, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> I always encounter btrfs deadlocks / hung tasks when I have a lot of
>> cached memory and I'm doing heavy rsync --inplace operations on my system
>> from btrfs zlib-compressed disk A to btrfs zlib-compressed disk B.
>>
>> The last output I see in this case is kswapd0 running for a long time at
>> 100% CPU. Then the whole system gets stuck. I cannot connect via ssh
>> anymore, but the kernel still prints hung tasks every few minutes.
>>
>> Maybe relevant: the system has NO swap.
>>
>> vm.vfs_cache_pressure = 100
>> vm.swappiness = 50
>
> Are you able to capture the stack dumps? A sysrq-w would really help.
>
> -chris
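When sysrq-w prints only its header, one possible fallback is to read the same information from /proc while a shell is still responsive. The sketch below is not something proposed in this thread; it assumes a Linux /proc, root privileges, and a kernel that exposes /proc/<pid>/stack. The function names are made up for illustration. It lists processes in uninterruptible sleep (state D) and prints their kernel stacks.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for every process currently in state D."""
    if not os.path.isdir(proc):
        return
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited or the line was unreadable
        if state == "D":
            yield pid, comm


def kernel_stack(pid, proc="/proc"):
    """Return the kernel stack of pid, or a note if it cannot be read."""
    try:
        with open(os.path.join(proc, str(pid), "stack")) as f:
            return f.read()
    except OSError as exc:
        return f"<stack unavailable: {exc}>"


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(f"{pid} {comm}")
        print(kernel_stack(pid))
```

This only walks the top-level /proc entries, so individual threads of a multi-threaded process such as mysqld would need an additional pass over /proc/<pid>/task/. It also cannot help once the machine is so stuck that no new process can be started.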
ENOSPACE linux 4.8-rc6 BTRFS: space_info 4 has 18446743524878843904 free, is not full
Hi, this is vanilla linux 4.8-rc6 and i still have ENOSPC issues with btrfs - caused by wrong space_tree entries. [ 9736.921995] [ cut here ] [ 9736.923342] WARNING: CPU: 1 PID: 23942 at fs/btrfs/extent-tree.c:5734 btrfs_free_block_groups+0x35e/0x440 [btrfs] [ 9736.926229] Modules linked in: netconsole xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding sb_edac edac_core x86_pkg_temp_thermal coretemp kvm_intel kvm ipmi_si irqbypass i2c_i801 crc32_pclmul i2c_smbus shpchp ghash_clmulni_intel ipmi_msghandler button loop btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 md_mod sg sd_mod usbhid xhci_pci igb ehci_pci i2c_algo_bit xhci_hcd ehci_hcd i40e i2c_core ahci usbcore ptp usb_common libahci aacraid pps_core [ 9736.941228] CPU: 1 PID: 23942 Comm: umount Not tainted 4.8.0-rc6 #6 [ 9736.943497] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 [ 9736.945561] 8a571d3d3cf8 a33de0b3 [ 9736.947720] 8a571d3d3d38 a3084e01 1666a30e2499 [ 9736.949880] 8a577b216088 8a577b216000 8a577aa17200 [ 9736.952043] Call Trace: [ 9736.952711] [] dump_stack+0x63/0x90 [ 9736.954139] [] __warn+0xd1/0xf0 [ 9736.955466] [] warn_slowpath_null+0x1d/0x20 [ 9737.022831] [] btrfs_free_block_groups+0x35e/0x440 [btrfs] [ 9737.091125] [] close_ctree+0x15d/0x340 [btrfs] [ 9737.159547] [] btrfs_put_super+0x19/0x20 [btrfs] [ 9737.227648] [] generic_shutdown_super+0x6f/0x100 [ 9737.295227] [] kill_anon_super+0x12/0x20 [ 9737.362199] [] btrfs_kill_super+0x16/0xa0 [btrfs] [ 9737.428716] [] deactivate_locked_super+0x43/0x70 [ 9737.494608] [] deactivate_super+0x5c/0x60 [ 9737.559338] [] cleanup_mnt+0x3f/0x90 [ 9737.623414] [] __cleanup_mnt+0x12/0x20 [ 9737.687439] [] task_work_run+0x7e/0xa0 [ 9737.750376] [] exit_to_usermode_loop+0xb0/0xc0 [ 9737.813436] [] do_syscall_64+0x189/0x1f0 [ 9737.875948] [] entry_SYSCALL64_slow_path+0x25/0x25 [ 9737.938449] ---[ end trace 767418320c59f391 ]--- [ 
9738.000649] [ cut here ] [ 9738.062721] WARNING: CPU: 1 PID: 23942 at fs/btrfs/extent-tree.c:5735 btrfs_free_block_groups+0x37d/0x440 [btrfs] [ 9738.128037] Modules linked in: netconsole xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding sb_edac edac_core x86_pkg_temp_thermal coretemp kvm_intel kvm ipmi_si irqbypass i2c_i801 crc32_pclmul i2c_smbus shpchp ghash_clmulni_intel ipmi_msghandler button loop btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 md_mod sg sd_mod usbhid xhci_pci igb ehci_pci i2c_algo_bit xhci_hcd ehci_hcd i40e i2c_core ahci usbcore ptp usb_common libahci aacraid pps_core [ 9738.487487] CPU: 1 PID: 23942 Comm: umount Tainted: GW 4.8.0-rc6 #6 [ 9738.564325] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 [ 9738.641240] 8a571d3d3cf8 a33de0b3 [ 9738.719075] 8a571d3d3d38 a3084e01 1667a30e2499 [ 9738.796597] 8a577b216088 8a577b216000 8a577aa17200 [ 9738.874120] Call Trace: [ 9738.950936] [] dump_stack+0x63/0x90 [ 9739.028733] [] __warn+0xd1/0xf0 [ 9739.106261] [] warn_slowpath_null+0x1d/0x20 [ 9739.184672] [] btrfs_free_block_groups+0x37d/0x440 [btrfs] [ 9739.263666] [] close_ctree+0x15d/0x340 [btrfs] [ 9739.341059] [] btrfs_put_super+0x19/0x20 [btrfs] [ 9739.416799] [] generic_shutdown_super+0x6f/0x100 [ 9739.490619] [] kill_anon_super+0x12/0x20 [ 9739.562240] [] btrfs_kill_super+0x16/0xa0 [btrfs] [ 9739.632301] [] deactivate_locked_super+0x43/0x70 [ 9739.700692] [] deactivate_super+0x5c/0x60 [ 9739.767536] [] cleanup_mnt+0x3f/0x90 [ 9739.833250] [] __cleanup_mnt+0x12/0x20 [ 9739.898076] [] task_work_run+0x7e/0xa0 [ 9739.962977] [] exit_to_usermode_loop+0xb0/0xc0 [ 9740.028080] [] do_syscall_64+0x189/0x1f0 [ 9740.092352] [] entry_SYSCALL64_slow_path+0x25/0x25 [ 9740.156898] ---[ end trace 767418320c59f392 ]--- [ 9740.221712] [ cut here ] [ 9740.286589] WARNING: CPU: 1 PID: 23942 at fs/btrfs/extent-tree.c:10062 
btrfs_free_block_groups+0x2a9/0x440 [btrfs] [ 9740.354299] Modules linked in: netconsole xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding sb_edac edac_core x86_pkg_temp_thermal coretemp kvm_intel kvm ipmi_si irqbypass i2c_i801 crc32_pclmul i2c_smbus shpchp ghash_clmulni_intel ipmi_msghandler button loop btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 md_mod sg sd_mod usbhid xhci_pci igb ehci_pci i2c_algo_bit xhci_hcd ehci_hcd i40e i2c_core ahci usbcore ptp usb_common libahci aacraid pps_core [ 9740.720097] CPU: 1 PID: 23942 Comm: umount Tainted: GW 4.8.0-rc6 #6 [ 9740.797412]
btrfs_endio_write_helper hard lock blocked workqueues with kernel 4.8-rc5
Hi, today i've seen this one with 4.8-rc5 and the system was going to be unresponsible. BUG: workqueue lockup - pool cpus=14 node=1 flags=0x0 nice=0 stuck for 33s! BUG: workqueue lockup - pool cpus=14 node=1 flags=0x0 nice=-20 stuck for 33s! Showing busy workqueues and worker pools: workqueue kblockd: flags=0x18 pwq 29: cpus=14 node=1 flags=0x0 nice=-20 active=9/256 pending: cfq_kick_queue, cfq_kick_queue, cfq_kick_queue, cfq_kick_queue, cfq_kick_queue, cfq_kick_queue, cfq_kick_queue, cfq_kick_queue, cfq_kick_queue workqueue vmstat: flags=0xc pwq 28: cpus=14 node=1 flags=0x0 nice=0 active=1/256 pending: vmstat_update workqueue btrfs-endio-write: flags=0xe pwq 66: cpus=8-15,24-31 node=1 flags=0x4 nice=0 active=8/8 in-flight: 12942:btrfs_endio_write_helper [btrfs], 11348:btrfs_endio_write_helper [btrfs], 11350:btrfs_endio_write_helper [btrfs], 5472:btrfs_endio_write_helper [btrfs], 3277:btrfs_endio_write_helper [btrfs], 13523:btrfs_endio_write_helper [btrfs], 5477:btrfs_endio_write_helper [btrfs], 5471:btrfs_endio_write_helper [btrfs] delayed: btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], 
btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper 
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
Re: [PATCH v3] btrfs: should block unused block groups deletion work when allocating data space
Thanks, this one works fine. No deadlocks. Stefan Am 09.09.2016 um 10:17 schrieb Wang Xiaoguang: > cleaner_kthread() may run at any time, in which it'll call > btrfs_delete_unused_bgs() > to delete unused block groups. Because this work is asynchronous, it may also > result > in false ENOSPC error. Please see below race window: > >CPU1 | CPU2 > | > |-> btrfs_alloc_data_chunk_ondemand() |-> cleaner_kthread() > |-> do_chunk_alloc() | | > | assume it returns ENOSPC, which means | | > | btrfs_space_info is full and have free| | > | space to satisfy data request.| | > | | |- > > btrfs_delete_unused_bgs() > | | |it will decrease > btrfs_space_info > | | |total_bytes and make > | | |btrfs_space_info is > not full. > | | | > In this case, we may get ENOSPC error, but btrfs_space_info is not full. > > To fix this issue, in btrfs_alloc_data_chunk_ondemand(), if we need to call > do_chunk_alloc() to allocating new chunk, we should block > btrfs_delete_unused_bgs(). > Here we introduce a new struct rw_semaphore bg_delete_sem to do this job. > > Indeed there is already a "struct mutex delete_unused_bgs_mutex", but it's > mutex, > we can not use it for this purpose. Of course, we can re-define it to be > struct > rw_semaphore, then use it in btrfs_alloc_data_chunk_ondemand(). Either method > will > work. > > But given that delete_unused_bgs_mutex's name length is longer than > bg_delete_sem, > I choose the first method, to create a new struct rw_semaphore bg_delete_sem > and > delete delete_unused_bgs_mutex :) > > Reported-by: Stefan Priebe <s.pri...@profihost.ag> > Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com> > --- > V2: fix a deadlock revealed by fstests case btrfs/071, we call > start_transaction() before in down_write(bg_delete_sem) in > btrfs_delete_unused_bgs(). > > v3: Stefan Priebe reported another similar deadlock, so here we choose > to not call down_read(bg_delete_sem) for free space inode in > btrfs_alloc_data_chunk_ondemand(). 
Meanwhile because we only do the > data space reservation for free space cache in the transaction context, > btrfs_delete_unused_bgs() will either have finished its job, or start > a new transaction waiting current transaction to complete, there will > be no unused block groups to be deleted, so it's safe to not call > down_read(bg_delete_sem) > --- > --- > fs/btrfs/ctree.h | 2 +- > fs/btrfs/disk-io.c | 13 +-- > fs/btrfs/extent-tree.c | 59 > -- > fs/btrfs/volumes.c | 42 +-- > 4 files changed, 76 insertions(+), 40 deletions(-) > > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h > index eff3993..fa78ef9 100644 > --- a/fs/btrfs/ctree.h > +++ b/fs/btrfs/ctree.h > @@ -788,6 +788,7 @@ struct btrfs_fs_info { > struct mutex cleaner_mutex; > struct mutex chunk_mutex; > struct mutex volume_mutex; > + struct rw_semaphore bg_delete_sem; > > /* >* this is taken to make sure we don't set block groups ro after > @@ -1068,7 +1069,6 @@ struct btrfs_fs_info { > spinlock_t unused_bgs_lock; > struct list_head unused_bgs; > struct mutex unused_bg_unpin_mutex; > - struct mutex delete_unused_bgs_mutex; > > /* For btrfs to record security options */ > struct security_mnt_opts security_opts; > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 54bc8c7..3cdbd05 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -1868,12 +1868,11 @@ static int cleaner_kthread(void *arg) > btrfs_run_defrag_inodes(root->fs_info); > > /* > - * Acquires fs_info->delete_unused_bgs_mutex to avoid racing > - * with relocation (btrfs_relocate_chunk) and relocation > - * acquires fs_info->cleaner_mutex (btrfs_relocate_block_group) > - * after acquiring fs_info->delete_unused_bgs_mutex. So we > - * can't hold, nor need to, fs_info->cleaner_mutex when deleting > - * unused block groups. > + * Acquires fs_info->bg_delete_sem to avoid racing with > + * relocation (btrfs_relo
Re: btrfs and systemd
On 29.08.2016 at 13:33, Timofey Titovets wrote:
> Did you try: nofail,noauto,x-systemd.automount ?

Sure, this fails too, as systemd applies the same timeout there.

Mr. Poettering has recommended that I do the following:

# mkdir -p /etc/systemd/system/$(systemd-escape --suffix=mount -p /foo/bar/baz).d/
# cat > /etc/systemd/system/$(systemd-escape --suffix=mount -p /foo/bar/baz).d/timeout.conf <

> 2016-08-29 9:28 GMT+03:00 Stefan Priebe - Profihost AG <s.pri...@profihost.ag>:
>> Hi Qu,
>>
>> On 29.08.2016 at 03:48, Qu Wenruo wrote:
>>> At 08/29/2016 04:15 AM, Stefan Priebe - Profihost AG wrote:
>>>> Hi,
>>>>
>>>> I'm trying to get my 60TB btrfs volume to mount with systemd at boot.
>>>> But this always fails with: "mounting timed out. Stopping." after 90s.
>>>
>>> 60TB is quite large, and in most cases that alone will already cause
>>> mount speed problems.
>>>
>>> In our test environment, filling a fs with 16K small files to 2T (just
>>> 128K files) will already slow the mount process to 10s.
>>>
>>> A larger fs, or more specifically a larger extent tree, will obviously
>>> slow the mount process.
>>>
>>> The root fix will need a rework of the extent tree.
>>> AFAIK Josef is working on the rework.
>>>
>>> So the btrfs fix will need some time.
>>
>> Thanks, but I have no problem with the long mount time (6 minutes in my
>> case). I'm just wondering how to live with it under systemd, as it
>> always cancels the mount process after 90s and I see no fstab option to
>> change this.
>>
>> Greets,
>> Stefan
>>
>>> Thanks,
>>> Qu
>>>>
>>>> I can't find any fstab setting for systemd to raise this timeout.
>>>> There's just x-systemd.device-timeout, but this controls how long to
>>>> wait for the device and not for the mount command.
>>>>
>>>> Is there any solution for big btrfs volumes and systemd?
>>>>
>>>> Greets,
>>>> Stefan
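The content of the timeout.conf drop-in did not survive in the archive. As a minimal sketch only, and not the original recommendation: systemd mount units accept a TimeoutSec= setting in their [Mount] section, so a drop-in along the following lines should raise the limit for one mount point. The helper name write_mount_timeout, the DROPIN_ROOT variable and the 10min value used below are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): write a drop-in that sets
# TimeoutSec= for a single mount unit.
#   $1  unit name, e.g. the output of
#       systemd-escape --suffix=mount -p /foo/bar/baz
#   $2  timeout value, e.g. 10min (assumed; choose one above your mount time)
# DROPIN_ROOT is an assumed override so the helper can be tried outside /etc;
# it defaults to /etc/systemd/system.
write_mount_timeout() {
    unit="$1"
    timeout="$2"
    dir="${DROPIN_ROOT:-/etc/systemd/system}/${unit}.d"
    mkdir -p "$dir" || return 1
    # The [Mount] section of a mount unit carries the timeout for the mount command.
    printf '[Mount]\nTimeoutSec=%s\n' "$timeout" > "$dir/timeout.conf"
}
```

It would be called as write_mount_timeout "$(systemd-escape --suffix=mount -p /foo/bar/baz)" 10min, followed by systemctl daemon-reload so that systemd picks up the new drop-in.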
Re: memory overflow or underflow in free space tree / space_info?
Hi Josef, this still hapens with current 4.8-rc* releases. Anything i can do to debug this? May be insert some code to check for an under or overflow in the code? Stefan Am 14.08.2016 um 17:22 schrieb Stefan Priebe - Profihost AG: > Hi Josef, > > anything i could do or test? Results with a vanilla next branch are the > same. > > Stefan > > Am 11.08.2016 um 08:09 schrieb Stefan Priebe - Profihost AG: >> Hello, >> >> the backtrace and info on umount looks the same: >> >> [241910.341124] [ cut here ] >> [241910.379991] WARNING: CPU: 1 PID: 26664 at >> fs/btrfs/extent-tree.c:5701 btrfs_free_block_groups+0x370/0x410 [btrfs] >> [241910.422099] Modules linked in: netconsole mpt3sas ipt_REJECT >> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp >> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci >> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si >> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod >> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas >> [241910.616845] CPU: 1 PID: 26664 Comm: umount Not tainted >> 4.7.0-rc6-29043-g8b8b08c #1 >> [241910.669646] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c >> 02/18/2015 >> [241910.723716] 8808d104bca8 bd3d83cf >> >> [241910.779309] 8808d104bcf8 bd085615 >> 8808d104bd08 >> [241910.835143] 16455a3410a8 0047a000 >> 8808469e2088 >> [241910.891882] Call Trace: >> [241910.947624] [] dump_stack+0x63/0x84 >> [241911.003714] [] __warn+0xe5/0x100 >> [241911.060167] [] warn_slowpath_null+0x1d/0x20 >> [241911.117422] [] >> btrfs_free_block_groups+0x370/0x410 [btrfs] >> [241911.175975] [] close_ctree+0x15b/0x330 [btrfs] >> [241911.235170] [] btrfs_put_super+0x19/0x20 [btrfs] >> [241911.294638] [] generic_shutdown_super+0x6f/0x100 >> [241911.353005] [] kill_anon_super+0x16/0x30 >> [241911.409832] [] btrfs_kill_super+0x1a/0xb0 [btrfs] >> [241911.466467] [] 
deactivate_locked_super+0x51/0x90 >> [241911.522602] [] deactivate_super+0x4e/0x70 >> [241911.577979] [] cleanup_mnt+0x43/0x90 >> [241911.633188] [] __cleanup_mnt+0x12/0x20 >> [241911.688146] [] task_work_run+0x81/0xb0 >> [241911.742740] [] exit_to_usermode_loop+0x66/0x95 >> [241911.797039] [] do_syscall_64+0x10d/0x150 >> [241911.850750] [] entry_SYSCALL64_slow_path+0x25/0x25 >> [241911.903564] ---[ end trace fae017546778f2b0 ]--- >> [241911.955332] [ cut here ] >> [241912.006262] WARNING: CPU: 1 PID: 26664 at >> fs/btrfs/extent-tree.c:5702 btrfs_free_block_groups+0x40a/0x410 [btrfs] >> [241912.059326] Modules linked in: netconsole mpt3sas ipt_REJECT >> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp >> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci >> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si >> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov >> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod >> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas >> [241912.298666] CPU: 1 PID: 26664 Comm: umount Tainted: GW >> 4.7.0-rc6-29043-g8b8b08c #1 >> [241912.363401] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c >> 02/18/2015 >> [241912.429395] 8808d104bca8 bd3d83cf >> >> [241912.497080] 8808d104bcf8 bd085615 >> 8808d104bd08 >> [241912.565113] 16465a3410a8 0047a000 >> 8808469e2088 >> [241912.634105] Call Trace: >> [241912.702992] [] dump_stack+0x63/0x84 >> [241912.773473] [] __warn+0xe5/0x100 >> [241912.844339] [] warn_slowpath_null+0x1d/0x20 >> [241912.916083] [] >> btrfs_free_block_groups+0x40a/0x410 [btrfs] >> [241912.989103] [] close_ctree+0x15b/0x330 [btrfs] >> [241913.062672] [] btrfs_put_super+0x19/0x20 [btrfs] >> [241913.136364] [] generic_shutdown_super+0x6f/0x100 >> [241913.208701] [] kill_anon_super+0x16/0x30 >> [241913.279194] [] btrfs_kill_super+0x1a/0xb0 [btrfs] >> [241913.348065] [] deactivate_locked_super+0x51/0x90 >> [241913.415082] [] 
deactivate_super+0x4e/0x70 >> [241913.479841] [] cleanup_mnt+0x43/0x90 >> [241913.543353] [] __cleanup_mnt+0x12/0x20 >> [241913.605959] [] task_work_run+0x81/0xb0 >> [241913.667542] [] exit_t
Re: btrfs and systemd
Hi Qu,

On 29.08.2016 at 03:48, Qu Wenruo wrote:
> At 08/29/2016 04:15 AM, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> I'm trying to get my 60TB btrfs volume to mount with systemd at boot.
>> But this always fails with: "mounting timed out. Stopping." after 90s.
>
> 60TB is quite large, and in most cases that alone will already cause
> mount speed problems.
>
> In our test environment, filling a fs with 16K small files to 2T (just
> 128K files) will already slow the mount process to 10s.
>
> A larger fs, or more specifically a larger extent tree, will obviously
> slow the mount process.
>
> The root fix will need a rework of the extent tree.
> AFAIK Josef is working on the rework.
>
> So the btrfs fix will need some time.

Thanks, but I have no problem with the long mount time (6 minutes in my case). I'm just wondering how to live with it under systemd, as it always cancels the mount process after 90s and I see no fstab option to change this.

Greets,
Stefan

> Thanks,
> Qu
>>
>> I can't find any fstab setting for systemd to raise this timeout.
>> There's just x-systemd.device-timeout, but this controls how long to
>> wait for the device and not for the mount command.
>>
>> Is there any solution for big btrfs volumes and systemd?
>>
>> Greets,
>> Stefan
btrfs and systemd
Hi,

I'm trying to get my 60TB btrfs volume to mount with systemd at boot. But this always fails with "mounting timed out. Stopping." after 90s.

I can't find any fstab setting for systemd to raise this timeout. There's just x-systemd.device-timeout, but this controls how long to wait for the device and not for the mount command.

Is there any solution for big btrfs volumes and systemd?

Greets,
Stefan
Re: memory overflow or underflow in free space tree / space_info?
Hi Josef, anything i could do or test? Results with a vanilla next branch are the same. Stefan Am 11.08.2016 um 08:09 schrieb Stefan Priebe - Profihost AG: > Hello, > > the backtrace and info on umount looks the same: > > [241910.341124] [ cut here ] > [241910.379991] WARNING: CPU: 1 PID: 26664 at > fs/btrfs/extent-tree.c:5701 btrfs_free_block_groups+0x370/0x410 [btrfs] > [241910.422099] Modules linked in: netconsole mpt3sas ipt_REJECT > raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp > iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci > i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si > ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov > async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod > ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas > [241910.616845] CPU: 1 PID: 26664 Comm: umount Not tainted > 4.7.0-rc6-29043-g8b8b08c #1 > [241910.669646] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c > 02/18/2015 > [241910.723716] 8808d104bca8 bd3d83cf > > [241910.779309] 8808d104bcf8 bd085615 > 8808d104bd08 > [241910.835143] 16455a3410a8 0047a000 > 8808469e2088 > [241910.891882] Call Trace: > [241910.947624] [] dump_stack+0x63/0x84 > [241911.003714] [] __warn+0xe5/0x100 > [241911.060167] [] warn_slowpath_null+0x1d/0x20 > [241911.117422] [] > btrfs_free_block_groups+0x370/0x410 [btrfs] > [241911.175975] [] close_ctree+0x15b/0x330 [btrfs] > [241911.235170] [] btrfs_put_super+0x19/0x20 [btrfs] > [241911.294638] [] generic_shutdown_super+0x6f/0x100 > [241911.353005] [] kill_anon_super+0x16/0x30 > [241911.409832] [] btrfs_kill_super+0x1a/0xb0 [btrfs] > [241911.466467] [] deactivate_locked_super+0x51/0x90 > [241911.522602] [] deactivate_super+0x4e/0x70 > [241911.577979] [] cleanup_mnt+0x43/0x90 > [241911.633188] [] __cleanup_mnt+0x12/0x20 > [241911.688146] [] task_work_run+0x81/0xb0 > [241911.742740] [] exit_to_usermode_loop+0x66/0x95 > [241911.797039] [] 
do_syscall_64+0x10d/0x150 > [241911.850750] [] entry_SYSCALL64_slow_path+0x25/0x25 > [241911.903564] ---[ end trace fae017546778f2b0 ]--- > [241911.955332] [ cut here ] > [241912.006262] WARNING: CPU: 1 PID: 26664 at > fs/btrfs/extent-tree.c:5702 btrfs_free_block_groups+0x40a/0x410 [btrfs] > [241912.059326] Modules linked in: netconsole mpt3sas ipt_REJECT > raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp > iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci > i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si > ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov > async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod > ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas > [241912.298666] CPU: 1 PID: 26664 Comm: umount Tainted: GW > 4.7.0-rc6-29043-g8b8b08c #1 > [241912.363401] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c > 02/18/2015 > [241912.429395] 8808d104bca8 bd3d83cf > > [241912.497080] 8808d104bcf8 bd085615 > 8808d104bd08 > [241912.565113] 16465a3410a8 0047a000 > 8808469e2088 > [241912.634105] Call Trace: > [241912.702992] [] dump_stack+0x63/0x84 > [241912.773473] [] __warn+0xe5/0x100 > [241912.844339] [] warn_slowpath_null+0x1d/0x20 > [241912.916083] [] > btrfs_free_block_groups+0x40a/0x410 [btrfs] > [241912.989103] [] close_ctree+0x15b/0x330 [btrfs] > [241913.062672] [] btrfs_put_super+0x19/0x20 [btrfs] > [241913.136364] [] generic_shutdown_super+0x6f/0x100 > [241913.208701] [] kill_anon_super+0x16/0x30 > [241913.279194] [] btrfs_kill_super+0x1a/0xb0 [btrfs] > [241913.348065] [] deactivate_locked_super+0x51/0x90 > [241913.415082] [] deactivate_super+0x4e/0x70 > [241913.479841] [] cleanup_mnt+0x43/0x90 > [241913.543353] [] __cleanup_mnt+0x12/0x20 > [241913.605959] [] task_work_run+0x81/0xb0 > [241913.667542] [] exit_to_usermode_loop+0x66/0x95 > [241913.729612] [] do_syscall_64+0x10d/0x150 > [241913.791203] [] entry_SYSCALL64_slow_path+0x25/0x25 > [241913.852485] 
---[ end trace fae017546778f2b1 ]--- > [241913.913638] [ cut here ] > [241913.974871] WARNING: CPU: 1 PID: 26664 at > fs/btrfs/extent-tree.c:10013 btrfs_free_block_groups+0x2ba/0x410 [btrfs] > [241914.039315] Modules linked in: netconsole mpt3sas ipt_REJECT > raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp > iptable_filter ip_tables x_tables bonding coretemp loop usbhid e
Re: memory overflow or underflow in free space tree / space_info?
[241914.607523] 271dbd3dac8c 88085184aac8 0038 [241914.681318] Call Trace: [241914.754437] [] dump_stack+0x63/0x84 [241914.828796] [] __warn+0xe5/0x100 [241914.902953] [] warn_slowpath_null+0x1d/0x20 [241914.977271] [] btrfs_free_block_groups+0x2ba/0x410 [btrfs] [241915.052041] [] close_ctree+0x15b/0x330 [btrfs] [241915.126282] [] btrfs_put_super+0x19/0x20 [btrfs] [241915.200758] [] generic_shutdown_super+0x6f/0x100 [241915.273872] [] kill_anon_super+0x16/0x30 [241915.345132] [] btrfs_kill_super+0x1a/0xb0 [btrfs] [241915.414703] [] deactivate_locked_super+0x51/0x90 [241915.482488] [] deactivate_super+0x4e/0x70 [241915.547994] [] cleanup_mnt+0x43/0x90 [241915.611962] [] __cleanup_mnt+0x12/0x20 [241915.674717] [] task_work_run+0x81/0xb0 [241915.736398] [] exit_to_usermode_loop+0x66/0x95 [241915.798592] [] do_syscall_64+0x10d/0x150 [241915.860295] [] entry_SYSCALL64_slow_path+0x25/0x25 [241915.921642] ---[ end trace fae017546778f2b2 ]--- [241915.982893] BTRFS: space_info 4 has 114577997824 free, is not full [241916.045103] BTRFS: space_info total=307627032576, used=193048903680, pinned=0, reserved=0, may_use=688537059328, readonly=131072 Greets, Stefan Am 10.08.2016 um 23:31 schrieb Stefan Priebe - Profihost AG: > Hi Josef, > > same again with chris next branch: > > ERROR: error during balancing '/vmbackup/': No space left on device > There may be more info in syslog - try dmesg | tail > Dumping filters: flags 0x7, state 0x0, force is off > DATA (flags 0x2): balancing, usage=5 > METADATA (flags 0x2): balancing, usage=5 > SYSTEM (flags 0x2): balancing, usage=5 > > dmesg: > [203784.411189] BTRFS info (device dm-0): 114 enospc errors during balance > > uname -r 4.7.0-rc6-29043-g8b8b08c > > Greets, > Stefan > > Am 08.08.2016 um 08:17 schrieb Stefan Priebe - Profihost AG: >> Am 04.08.2016 um 13:40 schrieb Stefan Priebe - Profihost AG: >>> Am 29.07.2016 um 23:03 schrieb Josef Bacik: >>>> On 07/29/2016 03:14 PM, Omar Sandoval wrote: >>>>> On Fri, Jul 29, 2016 at 12:11:53PM 
-0700, Omar Sandoval wrote: >>>>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost >>>>>> AG wrote: >>>>>>> Dear list, >>>>>>> >>>>>>> i'm seeing btrfs no space messages frequently on big filesystems (> >>>>>>> 30TB). >>>>>>> >>>>>>> In all cases i'm getting a trace like this one a space_info warning. >>>>>>> (since commit [1]). Could someone please be so kind and help me >>>>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those >>>>>>> systems. >>>>>> >>>>>> Hm, so I think this indicates a bug in space accounting somewhere else >>>>>> rather than the free space tree itself. I haven't debugged one of these >>>>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too. >>>>> >>>>> I should've asked, what sort of filesystem activity triggers this? >>>>> >>>> >>>> Chris just fixed this I think, try his next branch from his git tree >>>> >>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git >>> >>> Thanks now running a 4.4 with those patches backported. If that still >>> shows an error i will try that vanilla tree. >> >> OK this didn't work. I'll start / try using the linux-btrfs next branch >> and look if this helps. >> >> Greets, >> Stefan >> >>> >>> Thanks! >>> >>> Stefan >>> >>>> and see if it still happens. Thanks, >>>> >>>> Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
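The two "BTRFS: space_info" lines in this message can be cross-checked with a little arithmetic. This is a sketch under one assumption: that the reported "free" figure is total minus used, pinned, reserved and readonly. The numbers in the log are consistent with that reading, and they also show may_use larger than total, which looks like the kind of accounting anomaly this thread is chasing. The function names are made up for illustration.

```shell
#!/bin/sh
# space_info_free TOTAL USED PINNED RESERVED READONLY
# Prints the free figure under the assumption stated above.
space_info_free() {
    echo $(( $1 - $2 - $3 - $4 - $5 ))
}

# may_use_exceeds_total TOTAL MAY_USE
# Succeeds when more space is marked "may_use" than the space_info holds in total.
may_use_exceeds_total() {
    [ "$2" -gt "$1" ]
}

# Values copied from the log lines in the message.
space_info_free 307627032576 193048903680 0 0 131072
if may_use_exceeds_total 307627032576 688537059328; then
    echo "may_use exceeds total"
fi
```

With the logged values the first call prints 114577997824, the same figure the kernel reports as free.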
Re: memory overflow or underflow in free space tree / space_info?
Hi Josef,

same again with Chris's next branch:

ERROR: error during balancing '/vmbackup/': No space left on device
There may be more info in syslog - try dmesg | tail
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=5
  METADATA (flags 0x2): balancing, usage=5
  SYSTEM (flags 0x2): balancing, usage=5

dmesg:
[203784.411189] BTRFS info (device dm-0): 114 enospc errors during balance

uname -r
4.7.0-rc6-29043-g8b8b08c

Greets,
Stefan

On 08.08.2016 at 08:17, Stefan Priebe - Profihost AG wrote:
> On 04.08.2016 at 13:40, Stefan Priebe - Profihost AG wrote:
>> On 29.07.2016 at 23:03, Josef Bacik wrote:
>>> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>>>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>>>>> Dear list,
>>>>>>
>>>>>> I'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>>>>>
>>>>>> In all cases I'm getting a trace like this one, a space_info warning
>>>>>> (since commit [1]). Could someone please be so kind and help me
>>>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>>>> systems.
>>>>>
>>>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>>>> rather than the free space tree itself. I haven't debugged one of these
>>>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>>>
>>>> I should've asked, what sort of filesystem activity triggers this?
>>>
>>> Chris just fixed this I think, try his next branch from his git tree
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>>
>> Thanks, now running a 4.4 with those patches backported. If that still
>> shows an error I will try that vanilla tree.
>
> OK, this didn't work. I'll try the linux-btrfs next branch and see if
> this helps.
>
> Greets,
> Stefan
>
>> Thanks!
>>
>> Stefan
>>
>>> and see if it still happens. Thanks,
>>>
>>> Josef
Re: memory overflow or underflow in free space tree / space_info?
On 04.08.2016 at 13:40, Stefan Priebe - Profihost AG wrote:
> On 29.07.2016 at 23:03, Josef Bacik wrote:
>> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>>>> Dear list,
>>>>>
>>>>> I'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>>>>
>>>>> In all cases I'm getting a trace like this one, a space_info warning
>>>>> (since commit [1]). Could someone please be so kind and help me
>>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>>> systems.
>>>>
>>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>>> rather than the free space tree itself. I haven't debugged one of these
>>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>>
>>> I should've asked, what sort of filesystem activity triggers this?
>>
>> Chris just fixed this I think, try his next branch from his git tree
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>
> Thanks, now running a 4.4 with those patches backported. If that still
> shows an error I will try that vanilla tree.

OK, this didn't work. I'll try the linux-btrfs next branch and see if this helps.

Greets,
Stefan

> Thanks!
>
> Stefan
>
>> and see if it still happens. Thanks,
>>
>> Josef
Re: memory overflow or underflow in free space tree / space_info?
On 29.07.2016 at 23:03, Josef Bacik wrote:
> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>>> Dear list,
>>>>
>>>> I'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>>>
>>>> In all cases I'm getting a trace like this one, a space_info warning
>>>> (since commit [1]). Could someone please be so kind and help me
>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>> systems.
>>>
>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>> rather than the free space tree itself. I haven't debugged one of these
>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>
>> I should've asked, what sort of filesystem activity triggers this?
>
> Chris just fixed this I think, try his next branch from his git tree
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git

Thanks, now running a 4.4 with those patches backported. If that still shows an error I will try that vanilla tree.

Thanks!

Stefan

> and see if it still happens. Thanks,
>
> Josef
Re: memory overflow or underflow in free space tree / space_info?
On 29.07.2016 at 21:14, Omar Sandoval wrote:
> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>> Dear list,
>>>
>>> I'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>>
>>> In all cases I'm getting a trace like this one, a space_info warning
>>> (since commit [1]). Could someone please be so kind and help me
>>> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
>>
>> Hm, so I think this indicates a bug in space accounting somewhere else
>> rather than the free space tree itself. I haven't debugged one of these
>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>
> I should've asked, what sort of filesystem activity triggers this?

Sure. The workload on the FS is basically:
- Write file1 (50GB - 500GB)
- cp --reflink=always file1 to file2
- apply changes to file2 (100MB - 5GB)
- cp --reflink=always file2 to file3
- apply changes to file3 (100MB - 5GB)
...
- delete file1
- cp --reflink=always file3 to file4
- apply changes to file4 (100MB - 5GB)
- delete file2
...

And this for around 300 files a day. In addition, btrfs balance with dusage=5 and musage=5 runs daily, sometimes in parallel with the workload above.

Greets,
Stefan
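The rolling workload described in this message can be sketched as a small script, which may help anyone trying to reproduce the problem. This is an illustration only: the function name rotate_backups, the KEEP parameter (how many generations stay on disk), the REFLINK_MODE variable and the one-line append standing in for "apply changes" are all assumptions, and cp --reflink requires GNU coreutils.

```shell
#!/bin/sh
# rotate_backups DIR COUNT KEEP
# Starting from DIR/file1, build file2..fileCOUNT as reflink copies of their
# predecessor, modify each copy, and delete generations older than KEEP.
# REFLINK_MODE is an assumed knob: "always" as in the thread, or "auto" to
# try the script on a filesystem without reflink support.
rotate_backups() {
    dir="$1"
    count="$2"
    keep="$3"
    mode="${REFLINK_MODE:-always}"
    i=1
    while [ "$i" -lt "$count" ]; do
        next=$((i + 1))
        cp --reflink="$mode" "$dir/file$i" "$dir/file$next" || return 1
        # Stand-in for "apply changes" (100MB - 5GB in the real workload).
        printf 'changes for file%s\n' "$next" >> "$dir/file$next"
        # Drop the generation that has fallen out of the retention window.
        old=$((next - keep))
        if [ "$old" -ge 1 ]; then
            rm -f "$dir/file$old"
        fi
        i=$next
    done
}
```

Calling rotate_backups /path/to/dir 5 2 after creating file1 deletes file1 once file3 has been modified and file2 once file4 has been modified, which matches the order given in the list above.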
Re: memory overflow or underflow in free space tree / space_info?
On 29.07.2016 at 21:11, Omar Sandoval wrote:
> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>> Dear list,
>>
>> I'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>
>> In all cases I'm getting a trace like this one, a space_info warning
>> (since commit [1]). Could someone please be so kind and help me
>> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
>
> Hm, so I think this indicates a bug in space accounting somewhere else
> rather than the free space tree itself. I haven't debugged one of these
> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.

Thanks.

>> [ cut here ]
>> WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5710
>
> Do these line numbers match up with yours?
>
> 5706 static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
> 5707 {
> 5708         block_rsv_release_bytes(fs_info, &fs_info->global_block_rsv, NULL,
> 5709                                 (u64)-1);
> 5710         WARN_ON(fs_info->delalloc_block_rsv.size > 0);
> 5711         WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
> 5712         WARN_ON(fs_info->trans_block_rsv.size > 0);
> 5713         WARN_ON(fs_info->trans_block_rsv.reserved > 0);
> 5714         WARN_ON(fs_info->chunk_block_rsv.size > 0);
> 5715         WARN_ON(fs_info->chunk_block_rsv.reserved > 0);
> 5716         WARN_ON(fs_info->delayed_block_rsv.size > 0);
> 5717         WARN_ON(fs_info->delayed_block_rsv.reserved > 0);
> 5718 }

Yes, they do. But the kernel I'm using is somewhat special: a 4.4 kernel with a patchset from Holger (CC'ed).
See here: https://github.com/hhoffstaette/kernel-patches/tree/c9cce0933a40db84627241143b123210aee0fde6/4.4.15 >> btrfs_free_block_groups+0x35a/0x400 [btrfs]() >> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas >> raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel >> usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core >> usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod >> raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx >> xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas >> pps_core >> CPU: 5 PID: 26421 Comm: umount Tainted: GW O4.4.15+43-ph #1 >> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015 >> 880ae8b47cd8 bd3c712f >> c03ec603 880ae8b47d18 bd0837e7 0047a000 >> 8806016a1400 8808881d2088 8808881d2000 >> Call Trace: >> [] dump_stack+0x63/0x84 >> [] warn_slowpath_common+0x97/0xe0 >> [] warn_slowpath_null+0x1a/0x20 >> [] btrfs_free_block_groups+0x35a/0x400 [btrfs] >> [] close_ctree+0x15b/0x330 [btrfs] >> [] btrfs_put_super+0x19/0x20 [btrfs] >> [] generic_shutdown_super+0x6f/0x100 >> [] kill_anon_super+0x16/0x30 >> [] btrfs_kill_super+0x1a/0xb0 [btrfs] >> [] deactivate_locked_super+0x51/0x90 >> [] deactivate_super+0x4e/0x70 >> [] cleanup_mnt+0x43/0x90 >> [] __cleanup_mnt+0x12/0x20 >> [] task_work_run+0x7e/0xa0 >> [] exit_to_usermode_loop+0x66/0x95 >> [] syscall_return_slowpath+0xa6/0xf0 >> [] int_ret_from_sys_call+0x25/0x8f >> ---[ end trace bd985b05cc90617f ]--- >> [ cut here ] >> WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5711 >> btrfs_free_block_groups+0x3f4/0x400 [btrfs]() >> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas >> raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables >> x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel >> usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 
i2c_core >> usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod >> raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx >> xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas >> pps_core >> CPU: 5 PID: 26421 Comm: umount Tainted: GW O4.4.15+43-ph #1 >> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015 >> 880ae8b47cd8 bd3c712f >> c03ec603 880ae8b47d18 bd0837e7 0047a000 >> 8806016a1400 f
memory overflow or underflow in free space tree / space_info?
Dear list, i'm seeing btrfs no space messages frequently on big filesystems (> 30TB). In all cases i'm getting a trace like this one a space_info warning. (since commit [1]). Could someone please be so kind and help me debugging / fixing this bug? I'm using space_cache=v2 on all those systems. [ cut here ] WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5710 btrfs_free_block_groups+0x35a/0x400 [btrfs]() Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas pps_core CPU: 5 PID: 26421 Comm: umount Tainted: GW O4.4.15+43-ph #1 Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015 880ae8b47cd8 bd3c712f c03ec603 880ae8b47d18 bd0837e7 0047a000 8806016a1400 8808881d2088 8808881d2000 Call Trace: [] dump_stack+0x63/0x84 [] warn_slowpath_common+0x97/0xe0 [] warn_slowpath_null+0x1a/0x20 [] btrfs_free_block_groups+0x35a/0x400 [btrfs] [] close_ctree+0x15b/0x330 [btrfs] [] btrfs_put_super+0x19/0x20 [btrfs] [] generic_shutdown_super+0x6f/0x100 [] kill_anon_super+0x16/0x30 [] btrfs_kill_super+0x1a/0xb0 [btrfs] [] deactivate_locked_super+0x51/0x90 [] deactivate_super+0x4e/0x70 [] cleanup_mnt+0x43/0x90 [] __cleanup_mnt+0x12/0x20 [] task_work_run+0x7e/0xa0 [] exit_to_usermode_loop+0x66/0x95 [] syscall_return_slowpath+0xa6/0xf0 [] int_ret_from_sys_call+0x25/0x8f ---[ end trace bd985b05cc90617f ]--- [ cut here ] WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5711 btrfs_free_block_groups+0x3f4/0x400 [btrfs]() Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables 
x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas pps_core CPU: 5 PID: 26421 Comm: umount Tainted: GW O4.4.15+43-ph #1 Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015 880ae8b47cd8 bd3c712f c03ec603 880ae8b47d18 bd0837e7 0047a000 8806016a1400 8808881d2088 8808881d2000 Call Trace: [] dump_stack+0x63/0x84 [] warn_slowpath_common+0x97/0xe0 [] warn_slowpath_null+0x1a/0x20 [] btrfs_free_block_groups+0x3f4/0x400 [btrfs] [] close_ctree+0x15b/0x330 [btrfs] [] btrfs_put_super+0x19/0x20 [btrfs] [] generic_shutdown_super+0x6f/0x100 [] kill_anon_super+0x16/0x30 [] btrfs_kill_super+0x1a/0xb0 [btrfs] [] deactivate_locked_super+0x51/0x90 [] deactivate_super+0x4e/0x70 [] cleanup_mnt+0x43/0x90 [] __cleanup_mnt+0x12/0x20 [] task_work_run+0x7e/0xa0 [] exit_to_usermode_loop+0x66/0x95 [] syscall_return_slowpath+0xa6/0xf0 [] int_ret_from_sys_call+0x25/0x8f ---[ end trace bd985b05cc906180 ]--- [ cut here ] WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:9990 btrfs_free_block_groups+0x2a4/0x400 [btrfs]() Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas pps_core CPU: 5 PID: 26421 Comm: umount Tainted: GW O4.4.15+43-ph #1 Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015 880ae8b47cd8 bd3c712f c03ec603 880ae8b47d18 bd0837e7 
880c6aaa4528 0038 8802fe8d8c88 8808881d2000 Call Trace: [] dump_stack+0x63/0x84 [] warn_slowpath_common+0x97/0xe0 [] warn_slowpath_null+0x1a/0x20 [] btrfs_free_block_groups+0x2a4/0x400 [btrfs] [] close_ctree+0x15b/0x330 [btrfs] [] btrfs_put_super+0x19/0x20 [btrfs] [] generic_shutdown_super+0x6f/0x100 [] kill_anon_super+0x16/0x30 [] btrfs_kill_super+0x1a/0xb0 [btrfs] [] deactivate_locked_super+0x51/0x90 [] deactivate_super+0x4e/0x70 [] cleanup_mnt+0x43/0x90 [] __cleanup_mnt+0x12/0x20 [] task_work_run+0x7e/0xa0
Re: ENOSPC / no space on very large devices
On 20.07.2016 at 09:35, Holger Hoffstätte wrote:
> On 07/20/16 07:31, Stefan Priebe - Profihost AG wrote:
>> Hi list,
>>
>> while i didn't had the problem for some month i'm now getting ENOSPC on
>> a regular basis on one host.
>
> Well, it's getting better. :)

Again the same problem.

>
>> if i umount the volume i get traces (i already did a clear_cache 4 days
>> ago to recalculate the space_tree):
>>
>> [545031.675797] [ cut here ]
>> [545031.725166] WARNING: CPU: 1 PID: 17711 at
>> fs/btrfs/extent-tree.c:5710 btrfs_free_block_groups+0x35a/0x400 [btrfs]()
>
> This is "only" a warning, but as we can see below it indicates a real
> problem. The warning was added only recently to for-next by the patch called
> "Btrfs: warn_on for unaccounted spaces" [1], but I've had it in my tree
> forever. Never seen the warning myself.
>
> (snip)
>> [545037.909700] BTRFS: space_info 4 has 18446743523026157568 free, is
>> not full
>
> Wow, ~18.4 exabytes really is a lot of free space. :)
> So it looks like something underflowed the space_info and now things are
> confused for about ~550 GB. Unfortunately I have no good idea how to fix
> that.
:( umount triggered this one: [983102.838217] [ cut here ] [983102.864383] WARNING: CPU: 1 PID: 483 at fs/btrfs/extent-tree.c:5710 btrfs_free_block_groups+0x35a/0x400 [btrfs]() [983102.894424] Modules linked in: netconsole xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding usbhid coretemp loop xhci_pci ehci_pci xhci_hcd ehci_hcd i40e(O) sb_edac vxlan ip6_udp_tunnel usbcore ipmi_si i2c_i801 shpchp usb_common udp_tunnel edac_core ipmi_msghandler button btrfs dm_mod raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 md_mod igb i2c_algo_bit i2c_core sg sd_mod ptp ahci libahci pps_core aacraid [983103.043010] CPU: 1 PID: 483 Comm: umount Tainted: G O 4.4.15+43-ph #1 [983103.084441] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 [983103.127289] 880074673cd8 ad3c712f [983103.171816] c0298603 880074673d18 ad0837e7 000b2000 [983103.216766] 88103bd7e600 881037c92088 881037c92000 [983103.262776] Call Trace: [983103.308602] [] dump_stack+0x63/0x84 [983103.355609] [] warn_slowpath_common+0x97/0xe0 [983103.403528] [] warn_slowpath_null+0x1a/0x20 [983103.451297] [] btrfs_free_block_groups+0x35a/0x400 [btrfs] [983103.500439] [] close_ctree+0x15b/0x330 [btrfs] [983103.548805] [] btrfs_put_super+0x19/0x20 [btrfs] [983103.597122] [] generic_shutdown_super+0x6f/0x100 [983103.645398] [] kill_anon_super+0x16/0x30 [983103.693384] [] btrfs_kill_super+0x1a/0xb0 [btrfs] [983103.742430] [] deactivate_locked_super+0x51/0x90 [983103.791501] [] deactivate_super+0x4e/0x70 [983103.839979] [] cleanup_mnt+0x43/0x90 [983103.889050] [] __cleanup_mnt+0x12/0x20 [983103.937756] [] task_work_run+0x7e/0xa0 [983103.986032] [] exit_to_usermode_loop+0x66/0x95 [983104.035214] [] syscall_return_slowpath+0xa6/0xf0 [983104.084312] [] int_ret_from_sys_call+0x25/0x8f [983104.134098] ---[ end trace ca97a745adcb888f ]--- [983104.184540] [ cut here ] [983104.235514] WARNING: CPU: 1 PID: 483 at fs/btrfs/extent-tree.c:5711 
btrfs_free_block_groups+0x3f4/0x400 [btrfs]() [983104.290282] Modules linked in: netconsole xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding usbhid coretemp loop xhci_pci ehci_pci xhci_hcd ehci_hcd i40e(O) sb_edac vxlan ip6_udp_tunnel usbcore ipmi_si i2c_i801 shpchp usb_common udp_tunnel edac_core ipmi_msghandler button btrfs dm_mod raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 md_mod igb i2c_algo_bit i2c_core sg sd_mod ptp ahci libahci pps_core aacraid [983104.536076] CPU: 1 PID: 483 Comm: umount Tainted: GW O 4.4.15+43-ph #1 [983104.601962] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015 [983104.669312] 880074673cd8 ad3c712f [983104.738337] c0298603 880074673d18 ad0837e7 000b2000 [983104.807874] 88103bd7e600 881037c92088 881037c92000 [983104.878415] Call Trace: [983104.948803] [] dump_stack+0x63/0x84 [983105.020781] [] warn_slowpath_common+0x97/0xe0 [983105.093734] [] warn_slowpath_null+0x1a/0x20 [983105.166888] [] btrfs_free_block_groups+0x3f4/0x400 [btrfs] [983105.241461] [] close_ctree+0x15b/0x330 [btrfs] [983105.316744] [] btrfs_put_super+0x19/0x20 [btrfs] [983105.392021] [] generic_shutdown_super+0x6f/0x100 [983105.465852] [] kill_anon_super+0x16/0x30 [983105.537829] [] btrfs_kill_super+0x1a/0xb0 [btrfs] [983105.608078] [] deactivate_locked_super+0x51/0x90 [983105.676494] [] deactivate_super
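
The "~550 GB" estimate quoted in this reply can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the counter behind the "space_info 4 has ... free" message is an unsigned 64-bit value that wrapped around; the helper name as_signed64 is made up for illustration and is not part of btrfs.

```python
# Value copied from the quoted message:
# "BTRFS: space_info 4 has 18446743523026157568 free, is not full"
reported_free = 18446743523026157568


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a two's-complement signed one."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


signed_free = as_signed64(reported_free)
underflow_bytes = -signed_free

print(f"signed interpretation: {signed_free} bytes")
print(f"apparent underflow: {underflow_bytes / 1e9:.1f} GB")
```

Read this way, the counter sits roughly 550.7 GB below zero, which matches the size of the confusion described above.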
Re: ENOSPC / no space on very large devices
here we go...

On 20.07.2016 at 08:31, Wang Xiaoguang wrote:
> hello,
>
> On 07/20/2016 01:31 PM, Stefan Priebe - Profihost AG wrote:
>> Hi list,
>>
>> while i didn't had the problem for some month i'm now getting ENOSPC on
>> a regular basis on one host.
>>
>> It would be great if someone can help me debugging this.
>>
>> Some basic informations:
>> # touch /vmbackup/abc
>> touch: cannot touch `/vmbackup/abc': No space left on device
> When touch operation failed, would you please change dir to
> /sys/fs/btrfs/UUID/allocation/data/ and show me these files' value.
> And also files in /sys/fs/btrfs/UUID/allocation/metadata. thanks.
> Here UUID is your real uuid :)

/sys/fs/btrfs/ebcb9a5e-d784-4e17-9cd0-bc67fe7b1ed6/allocation/data]# grep -H '' *
bytes_may_use:0
bytes_pinned:0
bytes_reserved:0
bytes_used:6175380234240
disk_total:6641093181440
disk_used:6175380234240
flags:1
grep: single: Is a directory
total_bytes:6641093181440
total_bytes_pinned:726104035328

/sys/fs/btrfs/ebcb9a5e-d784-4e17-9cd0-bc67fe7b1ed6/allocation/metadata]# grep -H '' *
bytes_may_use:2089625649152
bytes_pinned:0
bytes_reserved:0
bytes_used:36823187456
disk_total:95563022336
disk_used:73646374912
grep: dup: Is a directory
flags:4
total_bytes:47781511168
total_bytes_pinned:-16792829952

Greets,
Stefan

>
> Regards,
> Xiaoguang Wang
>
>> # df -h /vmbackup/
>> Filesystem                    Size  Used Avail Use% Mounted on
>> /dev/mapper/stripe0-vmbackup   37T   28T  8,5T  77% /vmbackup
>>
>> # btrfs filesystem df /vmbackup/
>> Data, single: total=27.87TiB, used=27.39TiB
>> System, DUP: total=8.00MiB, used=4.34MiB
>> Metadata, DUP: total=286.50GiB, used=199.91GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> # btrfs filesystem show /vmbackup/
>> Label: none uuid: c8c3abf7-8280-4baa-bb51-a8c599e48002
>> Total devices 1 FS bytes used 27.59TiB
>> devid 1 size 36.38TiB used 28.43TiB path
>> /dev/mapper/stripe0-vmbackup
>>
>> # mount | grep vmbackup
>> /dev/mapper/stripe0-vmbackup on /vmbackup type btrfs
>>
(rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,clear_cache,commit=300,subvolid=5,subvol=/) >> >> >> dmesg is empty. >> >> if i umount the volume i get traces (i already did a clear_cache 4 days >> ago to recalculate the space_tree): >> >> [545031.675797] [ cut here ] >> [545031.725166] WARNING: CPU: 1 PID: 17711 at >> fs/btrfs/extent-tree.c:5710 btrfs_free_block_groups+0x35a/0x400 [btrfs]() >> [545031.778329] Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 >> mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter >> ip_tables x_tables 8021q garp bonding coretemp loop i40e(O) vxlan >> ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd >> i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler >> button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy >> async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod >> ahci ptp libahci megaraid_sas pps_core >> [545032.081037] CPU: 1 PID: 17711 Comm: umount Tainted: G O >> 4.4.15+43-ph #1 >> [545032.145078] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c >> 02/18/2015 >> [545032.210238] 88010c40bcd8 bd3c712f >> >> [545032.275650] c03ec603 88010c40bd18 bd0837e7 >> 0047a000 >> [545032.341525] 88105e0ea400 881054a76088 >> 881054a76000 >> [545032.408500] Call Trace: >> [545032.475272] [] dump_stack+0x63/0x84 >> [545032.543620] [] warn_slowpath_common+0x97/0xe0 >> [545032.612900] [] warn_slowpath_null+0x1a/0x20 >> [545032.682026] [] >> btrfs_free_block_groups+0x35a/0x400 [btrfs] >> [545032.750297] [] close_ctree+0x15b/0x330 [btrfs] >> [545032.817085] [] btrfs_put_super+0x19/0x20 [btrfs] >> [545032.883439] [] generic_shutdown_super+0x6f/0x100 >> [545032.949302] [] kill_anon_super+0x16/0x30 >> [545033.014327] [] btrfs_kill_super+0x1a/0xb0 [btrfs] >> [545033.079031] [] deactivate_locked_super+0x51/0x90 >> [545033.143275] [] deactivate_super+0x4e/0x70 >> [545033.206535] [] cleanup_mnt+0x43/0x90 >> [545033.268842] [] __cleanup_mnt+0x12/0x20 >> 
[545033.331629] [] task_work_run+0x7e/0xa0 >> [545033.393350] [] exit_to_usermode_loop+0x66/0x95 >> [545033.454685] [] syscall_return_slowpath+0xa6/0xf0 >> [545033.515485] [] int_ret_from_sys_call+0x25/0x8f >> [545033.575890] ---[ end trace bd985b05cc90617c ]--- >> [545033.636708] -
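
The sysfs allocation counters posted in this reply can be screened for obviously inconsistent values. The sketch below is only an illustration: the function names parse_allocation and find_anomalies are made up, and the two checks (negative counters, bytes_may_use larger than total_bytes) are heuristics assumed here, not rules documented by btrfs.

```python
# Counters copied from the metadata directory listing in the reply above.
METADATA_DUMP = """\
bytes_may_use:2089625649152
bytes_pinned:0
bytes_reserved:0
bytes_used:36823187456
disk_total:95563022336
disk_used:73646374912
grep: dup: Is a directory
flags:4
total_bytes:47781511168
total_bytes_pinned:-16792829952
"""


def parse_allocation(text: str) -> dict:
    """Parse `grep -H '' *` output into {counter: int}, skipping non-numeric lines."""
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.strip().partition(":")
        if not sep:
            continue
        try:
            values[key] = int(raw)
        except ValueError:
            continue  # e.g. "grep: dup: Is a directory"
    return values


def find_anomalies(values: dict) -> list:
    """Return human-readable notes for counters that look inconsistent (heuristic)."""
    notes = [f"{key} is negative ({val})" for key, val in values.items() if val < 0]
    may_use = values.get("bytes_may_use", 0)
    total = values.get("total_bytes", 0)
    if may_use > total:
        notes.append(f"bytes_may_use ({may_use}) exceeds total_bytes ({total})")
    return notes


for note in find_anomalies(parse_allocation(METADATA_DUMP)):
    print(note)
```

For the metadata counters this flags the negative total_bytes_pinned and a bytes_may_use of about 2 TB against about 48 GB of total metadata space; the data counters from the same reply pass both checks.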
ENOSPC / no space on very large devices
Hi list,

while I haven't had the problem for some months, I'm now getting ENOSPC on a regular basis on one host.

It would be great if someone could help me debug this.

Some basic information:
# touch /vmbackup/abc
touch: cannot touch `/vmbackup/abc': No space left on device

# df -h /vmbackup/
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/stripe0-vmbackup   37T   28T  8,5T  77% /vmbackup

# btrfs filesystem df /vmbackup/
Data, single: total=27.87TiB, used=27.39TiB
System, DUP: total=8.00MiB, used=4.34MiB
Metadata, DUP: total=286.50GiB, used=199.91GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

# btrfs filesystem show /vmbackup/
Label: none uuid: c8c3abf7-8280-4baa-bb51-a8c599e48002
Total devices 1 FS bytes used 27.59TiB
devid 1 size 36.38TiB used 28.43TiB path /dev/mapper/stripe0-vmbackup

# mount | grep vmbackup
/dev/mapper/stripe0-vmbackup on /vmbackup type btrfs (rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,clear_cache,commit=300,subvolid=5,subvol=/)

dmesg is empty.
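
As a quick plausibility check on the figures above, the short sketch below subtracts the reported totals. It only restates numbers from the "btrfs filesystem show" and "btrfs filesystem df" output; the variable names are illustrative.

```python
# Figures copied from the command output above.
device_size_tib = 36.38       # "devid 1 size 36.38TiB"
device_allocated_tib = 28.43  # "used 28.43TiB" (space already assigned to chunks)
metadata_total_gib = 286.50   # "Metadata, DUP: total=286.50GiB"
metadata_used_gib = 199.91    # "Metadata, DUP: used=199.91GiB"

unallocated_tib = device_size_tib - device_allocated_tib
metadata_slack_gib = metadata_total_gib - metadata_used_gib

print(f"unallocated on device: {unallocated_tib:.2f} TiB")
print(f"free inside metadata chunks: {metadata_slack_gib:.2f} GiB")
```

By these numbers about 7.95 TiB of the device is still unallocated and about 86.59 GiB is free inside the existing metadata chunks, so the ENOSPC here does not look like a genuinely full filesystem.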
if i umount the volume i get traces (i already did a clear_cache 4 days ago to recalculate the space_tree): [545031.675797] [ cut here ] [545031.725166] WARNING: CPU: 1 PID: 17711 at fs/btrfs/extent-tree.c:5710 btrfs_free_block_groups+0x35a/0x400 [btrfs]() [545031.778329] Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas pps_core [545032.081037] CPU: 1 PID: 17711 Comm: umount Tainted: G O 4.4.15+43-ph #1 [545032.145078] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015 [545032.210238] 88010c40bcd8 bd3c712f [545032.275650] c03ec603 88010c40bd18 bd0837e7 0047a000 [545032.341525] 88105e0ea400 881054a76088 881054a76000 [545032.408500] Call Trace: [545032.475272] [] dump_stack+0x63/0x84 [545032.543620] [] warn_slowpath_common+0x97/0xe0 [545032.612900] [] warn_slowpath_null+0x1a/0x20 [545032.682026] [] btrfs_free_block_groups+0x35a/0x400 [btrfs] [545032.750297] [] close_ctree+0x15b/0x330 [btrfs] [545032.817085] [] btrfs_put_super+0x19/0x20 [btrfs] [545032.883439] [] generic_shutdown_super+0x6f/0x100 [545032.949302] [] kill_anon_super+0x16/0x30 [545033.014327] [] btrfs_kill_super+0x1a/0xb0 [btrfs] [545033.079031] [] deactivate_locked_super+0x51/0x90 [545033.143275] [] deactivate_super+0x4e/0x70 [545033.206535] [] cleanup_mnt+0x43/0x90 [545033.268842] [] __cleanup_mnt+0x12/0x20 [545033.331629] [] task_work_run+0x7e/0xa0 [545033.393350] [] exit_to_usermode_loop+0x66/0x95 [545033.454685] [] syscall_return_slowpath+0xa6/0xf0 [545033.515485] [] int_ret_from_sys_call+0x25/0x8f [545033.575890] ---[ end trace bd985b05cc90617c ]--- 
[545033.636708] [ cut here ] [545033.696339] WARNING: CPU: 1 PID: 17711 at fs/btrfs/extent-tree.c:5711 btrfs_free_block_groups+0x3f4/0x400 [btrfs]() [545033.758031] Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas pps_core [545034.095188] CPU: 1 PID: 17711 Comm: umount Tainted: GW O 4.4.15+43-ph #1 [545034.166070] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015 [545034.236259] 88010c40bcd8 bd3c712f [545034.307690] c03ec603 88010c40bd18 bd0837e7 0047a000 [545034.379596] 88105e0ea400 881054a76088 881054a76000 [545034.452542] Call Trace: [545034.525286] [] dump_stack+0x63/0x84 [545034.599643] [] warn_slowpath_common+0x97/0xe0 [545034.674894] [] warn_slowpath_null+0x1a/0x20 [545034.750338] [] btrfs_free_block_groups+0x3f4/0x400 [btrfs] [545034.826354] [] close_ctree+0x15b/0x330 [btrfs] [545034.900758] [] btrfs_put_super+0x19/0x20 [btrfs] [545034.973612] [] generic_shutdown_super+0x6f/0x100 [545035.044589] [] kill_anon_super+0x16/0x30 [545035.113505] [] btrfs_kill_super+0x1a/0xb0 [btrfs] [545035.180769] [] deactivate_locked_super+0x51/0x90 [545035.246451] [] deactivate_super+0x4e/0x70 [545035.311231] [] cleanup_mnt+0x43/0x90 [545035.374958] []