big volumes only work reliably with ssd_spread

2018-01-15 Thread Stefan Priebe - Profihost AG
Hello,

for around two or three years I've been using btrfs for incremental VM backups.

some data:
- volume size 60TB
- around 2000 subvolumes
- each differential backup stacks on top of a subvolume
- compress-force=zstd
- space_cache=v2
- no quota / qgroups

this works fine since kernel 4.14, except that I need ssd_spread as a
mount option. If I do not use ssd_spread, I always end up with very slow
performance and a single kworker process using 100% CPU after some days.

With ssd_spread those boxes have been running fine for around 6 months. Is
this something expected? I haven't found any hint regarding such an impact.
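For reference, the options above combine into a single mount line; a sketch of the corresponding /etc/fstab entry, with a placeholder device and mount point:

```
# /etc/fstab (device and mount point are placeholders)
/dev/sdX  /vmbackup  btrfs  compress-force=zstd,space_cache=v2,ssd_spread  0  0
```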

Thanks!

Greets,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: how to repair or access broken btrfs?

2017-11-14 Thread Stefan Priebe - Profihost AG

On 14.11.2017 at 18:45, Andrei Borzenkov wrote:
> On 14.11.2017 at 12:56, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> after a controller firmware bug / failure I have a broken btrfs.
>>
>> # parent transid verify failed on 181846016 wanted 143404 found 143399
>>
>> running repair, fsck or zero-log always results in the same failure message:
>> extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered,
>> value -1
>> .. stack trace ..
>>
>> Is there any chance to get at least a single file out of the broken fs?
>>
> 
> Did you try "btrfs restore"?

Great, that worked for that file. I'm still wondering why a repair is not
possible.
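For the archives: btrfs restore reads the damaged filesystem read-only and copies out what it can reach, so it is safe to try before any repair attempt. A minimal sketch with a hypothetical target directory (-D does a dry run, -i ignores per-file errors):

```
# list what would be restored, without writing anything
btrfs restore -D /dev/mapper/crypt_md0 /mnt/rescue/
# copy the files out, skipping ones that hit errors
btrfs restore -i /dev/mapper/crypt_md0 /mnt/rescue/
```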

Greets,
Stefan



how to repair or access broken btrfs?

2017-11-14 Thread Stefan Priebe - Profihost AG
Hello,

after a controller firmware bug / failure I have a broken btrfs.

# parent transid verify failed on 181846016 wanted 143404 found 143399

running repair, fsck or zero-log always results in the same failure message:
extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered,
value -1
.. stack trace ..

Is there any chance to get at least a single file out of the broken fs?

Greets,
Stefan


Complete output:
./btrfs check --repair /dev/mapper/crypt_md0
enabling repair mode
parent transid verify failed on 181846016 wanted 143404 found 143399
parent transid verify failed on 181846016 wanted 143404 found 143399
Ignoring transid failure
Checking filesystem on /dev/mapper/crypt_md0
UUID: d3f9eee9-efbd-4590-858f-27b39d453350
repair mode will force to clear out log tree, are you sure? [y/N]: y
parent transid verify failed on 308183040 wanted 143404 found 143399
parent transid verify failed on 308183040 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 338870272 wanted 143404 found 143399
parent transid verify failed on 338870272 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 12778157178880 wanted 143404 found 143399
parent transid verify failed on 12778157178880 wanted 143404 found 143399
Ignoring transid failure
leaf parent key incorrect 38699008
btrfs unable to find ref byte nr 12778147823616 parent 0 root 2  owner 0
offset 0
parent transid verify failed on 308183040 wanted 143404 found 143399
Ignoring transid failure
leaf parent key incorrect 91766784
extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered,
value -1
./btrfs[0x415cb3]
./btrfs[0x416ee5]
./btrfs[0x417104]
./btrfs[0x418cea]
./btrfs[0x418f06]
./btrfs(btrfs_alloc_free_block+0x1e4)[0x41b8d0]
./btrfs(__btrfs_cow_block+0xd3)[0x40c5f9]
./btrfs(btrfs_cow_block+0x110)[0x40d03b]
./btrfs(commit_tree_roots+0x53)[0x439a37]
./btrfs(btrfs_commit_transaction+0xf9)[0x439e02]
./btrfs(cmd_check+0x861)[0x46172e]
./btrfs(main+0x163)[0x40b5e9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f44b14fab45]
./btrfs[0x40b0b9]
Aborted


btrfs-progs: check --repair crashes with BUG ON

2017-11-04 Thread Stefan Priebe - Profihost AG
Hello,

after a power failure I have a btrfs volume which isn't mountable.

dmesg shows:
parent transid verify failed on 181846016 wanted 143404 found 143399

If I run:
btrfs check --repair /dev/mapper/crypt_md1

The output is:
parent transid verify failed on 181846016 wanted 143404 found 143399
parent transid verify failed on 181846016 wanted 143404 found 143399
Ignoring transid failure
Clearing log on /dev/mapper/crypt_md0, previous log_root 1520200695808,
level 0
parent transid verify failed on 308183040 wanted 143404 found 143399
parent transid verify failed on 308183040 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 338870272 wanted 143404 found 143399
parent transid verify failed on 338870272 wanted 143404 found 143399
Ignoring transid failure
parent transid verify failed on 12778157178880 wanted 143404 found 143399
parent transid verify failed on 12778157178880 wanted 143404 found 143399
Ignoring transid failure
leaf parent key incorrect 38699008
btrfs unable to find ref byte nr 12778147823616 parent 0 root 2  owner 0
offset 0
parent transid verify failed on 308183040 wanted 143404 found 143399
Ignoring transid failure
leaf parent key incorrect 91766784
extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered,
value -1
./btrfs[0x415cb3]
./btrfs[0x416ee5]
./btrfs[0x417104]
./btrfs[0x418cea]
./btrfs[0x418f06]
./btrfs(btrfs_alloc_free_block+0x1e4)[0x41b8d0]
./btrfs(__btrfs_cow_block+0xd3)[0x40c5f9]
./btrfs(btrfs_cow_block+0x110)[0x40d03b]
./btrfs(commit_tree_roots+0x53)[0x439baa]
./btrfs(btrfs_commit_transaction+0xf9)[0x439f75]
./btrfs[0x467212]
./btrfs(handle_command_group+0x5d)[0x40b360]
./btrfs(cmd_rescue+0x15)[0x46749f]
./btrfs(main+0x163)[0x40b5e9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fc63f25db45]
./btrfs[0x40b0b9]
Aborted

This is btrfs-progs from the devel branch; the same happens with master and with v4.13.3.

Stefan


Re: btrfs is slow while looping in search_bitmap <-btrfs_find_space_for_alloc

2017-09-11 Thread Stefan Priebe - Profihost AG
On 05.09.2017 at 07:58, Stefan Priebe - Profihost AG wrote:
> Hello,
> 
> while experiencing slow btrfs volumes I switched to kernel v4.13 and to
> space_cache=v2.
...
> 
> Is btrfs trying too hard to find free space?
Even though nobody replied, I'll reply to myself: I could completely "fix"
this by using the ssd_spread option for my raid50.

Greets,
Stefan


Re: speed up big btrfs volumes with ssds

2017-09-11 Thread Stefan Priebe - Profihost AG
Hello,

On 04.09.2017 at 20:32, Stefan Priebe - Profihost AG wrote:
> On 04.09.2017 at 15:28, Timofey Titovets wrote:
>> 2017-09-04 15:57 GMT+03:00 Stefan Priebe - Profihost AG 
>> <s.pri...@profihost.ag>:
>>> On 04.09.2017 at 12:53, Henk Slager wrote:
>>>> On Sun, Sep 3, 2017 at 8:32 PM, Stefan Priebe - Profihost AG
>>>> <s.pri...@profihost.ag> wrote:
>>>>> Hello,
>>>>>
>>>>> I'm trying to speed up big btrfs volumes.
>>>>>
>>>>> Some facts:
>>>>> - Kernel will be 4.13-rc7
>>>>> - needed volume size is 60TB
>>>>>
>>>>> Currently without any ssds I get the best speed with:
>>>>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices
>>>>>
>>>>> and using btrfs as raid 0 for data and metadata on top of those 4 raid 5.
>>>>>
>>>>> I can live with a data loss every now and then ;-) so a raid 0 on
>>>>> top of the 4x raid5 is acceptable for me.
>>>>>
>>>>> Currently the write speed is not as good as I would like - especially
>>>>> for random 8k-16k I/O.
>>>>>
>>>>> My current idea is to use a pcie flash card with bcache on top of each
>>>>> raid 5.
>>>>
>>>> Whether it can speed things up depends quite a lot on what the use-case is, for
>>>> some not-so-much-parallel-access it might work. So this 60TB is then
>>>> 20 4TB disks or so and the 4x 1GB cache is simply not very helpful I
>>>> think. The working set doesn't fit in it I guess. If there is mostly
>>>> single or a few users of the fs, a single pcie based bcacheing 4
>>>> devices can work, but for SATA SSD, I would use 1 SSD per HWraid5.
>>>
>>> Yes that's roughly my idea as well and yes the workload is 4 users max
>>> writing data. 50% sequential, 50% random.
>>>
>>>> Then roughly make sure the complete set of metadata blocks fits in the
>>>> cache. For an fs of this size let's say/estimate 150G. Then maybe the same
>>>> or double for data, so an SSD of 500G would be a first try.
>>>
>>> I would use 1TB devices for each Raid or a 4TB PCIe card.
>>>
>>>> You give the impression that reliability for this fs is not the
>>>> highest prio, so if you go full risk, then put bcache in write-back
>>>> mode, then you will have your desired random 8k-16k I/O speedup after
>>>> the cache is warmed up. But any SW or HW failure will result in total
>>>> fs loss normally if SSD and HDD get out of sync somehow. Bcache
>>>> write-through might also be acceptable, you will need extensive
>>>> monitoring and tuning of all (bcache) parameters etc to be sure of the
>>>> right choice of size and setup etc.
>>>
>>> Yes, I wanted to use write-back mode. Has anybody already done tests
>>> or gathered experience with a setup like this?
>>>
>>
>> Maybe you can make your raid setup faster by:
>> 1. Use Single Profile
> 
> I'm already using the raid0 profile - see below:
> 
> Data,RAID0: Size:22.57TiB, Used:21.08TiB
> Metadata,RAID0: Size:90.00GiB, Used:82.28GiB
> System,RAID0: Size:64.00MiB, Used:1.53MiB
> 
>> 2. Use different stripe size for HW RAID5:
>> I think 16kb will be optimal with 5 devices per raid group
>> That will give you a 64kb data stripe and 16kb parity
>> Btrfs raid0 uses a 64kb stripe, so that can make data access
>> unaligned (or use the single profile for btrfs)
> 
> That sounds like an interesting idea except for the unaligned writes.
> Will need to test this.
> 
>> 3. Use btrfs ssd_spread to decrease RMW cycles.
> Can you explain this?
> 
> Stefan

I was able to fix this issue with ssd_spread. Could it be that the
default allocators (nossd and ssd) are searching too hard for free space?
Even the free space tree did not help.

Greets,
Stefan


btrfs is slow while looping in search_bitmap <-btrfs_find_space_for_alloc

2017-09-04 Thread Stefan Priebe - Profihost AG
Hello,

while experiencing slow btrfs volumes I switched to kernel v4.13 and to
space_cache=v2.

But I'm still seeing slow performance and single kworker processes
using 100% CPU.

Tracing the kworker process shows me:
# sed 's/.*: //' /trace | sort | uniq -c | sort -n
  21595 tree_search_offset.isra.23 <-btrfs_find_space_for_alloc
  21610 btrfs_find_space_for_alloc <-find_free_extent
  21619 _raw_spin_lock <-btrfs_find_space_for_alloc
  27431 _cond_resched <-find_free_extent
  27437 down_read <-find_free_extent
  27451 block_group_cache_done.isra.29 <-find_free_extent
  27451 btrfs_put_block_group <-find_free_extent
  27464 up_read <-find_free_extent
  27486 __get_raid_index <-find_free_extent
  27503 _raw_spin_lock <-find_free_extent
  48335 search_bitmap <-btrfs_find_space_for_alloc
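For reference, the histogram above can be reproduced from any ftrace function-trace dump with the same pipeline; a self-contained rerun on a synthetic three-line excerpt (the file path and trace contents are illustrative):

```shell
# build a tiny synthetic ftrace excerpt in the same format as /trace
cat > /tmp/trace <<'EOF'
 kworker/u24:4-13405 [003] 344186.202598: search_bitmap <-btrfs_find_space_for_alloc
 kworker/u24:4-13405 [003] 344186.202599: search_bitmap <-btrfs_find_space_for_alloc
 kworker/u24:4-13405 [003] 344186.202600: up_read <-find_free_extent
EOF
# strip everything up to the last ': ' (task, cpu, timestamp),
# then count how often each callee <-caller pair occurs, rarest last line first
sed 's/.*: //' /tmp/trace | sort | uniq -c | sort -n
```

In this synthetic run, search_bitmap <-btrfs_find_space_for_alloc comes out with the highest count, which is exactly the pattern in the real histogram: the free-space bitmap search dominating the samples.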

Is there anything to optimize? Can I speed this up?

There's still plenty of unallocated space:
# btrfs fi usage /vmbackup/
Overall:
Device size:  58.20TiB
Device allocated: 22.66TiB
Device unallocated:   35.54TiB
Device missing:  0.00B
Used: 21.07TiB
Free (estimated): 37.12TiB  (min: 37.12TiB)
Data ratio:   1.00
Metadata ratio:   1.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,RAID0: Size:22.57TiB, Used:20.99TiB
   /dev/sdc1   5.64TiB
   /dev/sdd1   5.64TiB
   /dev/sde1   5.64TiB
   /dev/sdf1   5.64TiB

Metadata,RAID0: Size:90.00GiB, Used:81.60GiB
   /dev/sdc1  22.50GiB
   /dev/sdd1  22.50GiB
   /dev/sde1  22.50GiB
   /dev/sdf1  22.50GiB

System,RAID0: Size:64.00MiB, Used:1.53MiB
   /dev/sdc1  16.00MiB
   /dev/sdd1  16.00MiB
   /dev/sde1  16.00MiB
   /dev/sdf1  16.00MiB

Unallocated:
   /dev/sdc1   8.88TiB
   /dev/sdd1   8.88TiB
   /dev/sde1   8.88TiB
   /dev/sdf1   8.88TiB

Is btrfs trying too hard to find free space?

Greets,
Stefan


Re: speed up big btrfs volumes with ssds

2017-09-04 Thread Stefan Priebe - Profihost AG
On 04.09.2017 at 15:28, Timofey Titovets wrote:
> 2017-09-04 15:57 GMT+03:00 Stefan Priebe - Profihost AG 
> <s.pri...@profihost.ag>:
>> On 04.09.2017 at 12:53, Henk Slager wrote:
>>> On Sun, Sep 3, 2017 at 8:32 PM, Stefan Priebe - Profihost AG
>>> <s.pri...@profihost.ag> wrote:
>>>> Hello,
>>>>
>>>> I'm trying to speed up big btrfs volumes.
>>>>
>>>> Some facts:
>>>> - Kernel will be 4.13-rc7
>>>> - needed volume size is 60TB
>>>>
>>>> Currently without any ssds I get the best speed with:
>>>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices
>>>>
>>>> and using btrfs as raid 0 for data and metadata on top of those 4 raid 5.
>>>>
>>>> I can live with a data loss every now and then ;-) so a raid 0 on
>>>> top of the 4x raid5 is acceptable for me.
>>>>
>>>> Currently the write speed is not as good as I would like - especially
>>>> for random 8k-16k I/O.
>>>>
>>>> My current idea is to use a pcie flash card with bcache on top of each
>>>> raid 5.
>>>
>>> Whether it can speed things up depends quite a lot on what the use-case is, for
>>> some not-so-much-parallel-access it might work. So this 60TB is then
>>> 20 4TB disks or so and the 4x 1GB cache is simply not very helpful I
>>> think. The working set doesn't fit in it I guess. If there is mostly
>>> single or a few users of the fs, a single pcie based bcacheing 4
>>> devices can work, but for SATA SSD, I would use 1 SSD per HWraid5.
>>
>> Yes that's roughly my idea as well and yes the workload is 4 users max
>> writing data. 50% sequential, 50% random.
>>
>>> Then roughly make sure the complete set of metadata blocks fits in the
>>> cache. For an fs of this size let's say/estimate 150G. Then maybe the same
>>> or double for data, so an SSD of 500G would be a first try.
>>
>> I would use 1TB devices for each Raid or a 4TB PCIe card.
>>
>>> You give the impression that reliability for this fs is not the
>>> highest prio, so if you go full risk, then put bcache in write-back
>>> mode, then you will have your desired random 8k-16k I/O speedup after
>>> the cache is warmed up. But any SW or HW failure will result in total
>>> fs loss normally if SSD and HDD get out of sync somehow. Bcache
>>> write-through might also be acceptable, you will need extensive
>>> monitoring and tuning of all (bcache) parameters etc to be sure of the
>>> right choice of size and setup etc.
>>
>> Yes, I wanted to use write-back mode. Has anybody already done tests
>> or gathered experience with a setup like this?
>>
> 
> Maybe you can make your raid setup faster by:
> 1. Use Single Profile

I'm already using the raid0 profile - see below:

Data,RAID0: Size:22.57TiB, Used:21.08TiB
Metadata,RAID0: Size:90.00GiB, Used:82.28GiB
System,RAID0: Size:64.00MiB, Used:1.53MiB

> 2. Use different stripe size for HW RAID5:
> I think 16kb will be optimal with 5 devices per raid group
> That will give you a 64kb data stripe and 16kb parity
> Btrfs raid0 uses a 64kb stripe, so that can make data access
> unaligned (or use the single profile for btrfs)

That sounds like an interesting idea except for the unaligned writes.
Will need to test this.
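To make the arithmetic behind the 16kb suggestion explicit: with 5 devices per RAID5 group, each full stripe has 4 data chunks and 1 parity chunk, so a 16kb per-disk chunk yields a 64kb data stripe, which lines up with the 64kb btrfs raid0 stripe mentioned above. A quick shell check (pure arithmetic, no hardware assumed):

```shell
devices=5          # disks per HW RAID5 group
chunk_kb=16        # per-disk chunk size suggested above
data_disks=$((devices - 1))            # one disk's worth goes to parity
stripe_kb=$((data_disks * chunk_kb))   # full data stripe width
echo "data stripe: ${stripe_kb}kb, parity per stripe: ${chunk_kb}kb"
# → data stripe: 64kb, parity per stripe: 16kb
```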

> 3. Use btrfs ssd_spread to decrease RMW cycles.
Can you explain this?

Stefan


Re: speed up big btrfs volumes with ssds

2017-09-04 Thread Stefan Priebe - Profihost AG
On 04.09.2017 at 12:53, Henk Slager wrote:
> On Sun, Sep 3, 2017 at 8:32 PM, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
>> Hello,
>>
>> I'm trying to speed up big btrfs volumes.
>>
>> Some facts:
>> - Kernel will be 4.13-rc7
>> - needed volume size is 60TB
>>
>> Currently without any ssds I get the best speed with:
>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices
>>
>> and using btrfs as raid 0 for data and metadata on top of those 4 raid 5.
>>
>> I can live with a data loss every now and then ;-) so a raid 0 on
>> top of the 4x raid5 is acceptable for me.
>>
>> Currently the write speed is not as good as I would like - especially
>> for random 8k-16k I/O.
>>
>> My current idea is to use a pcie flash card with bcache on top of each
>> raid 5.
> 
> Whether it can speed things up depends quite a lot on what the use-case is, for
> some not-so-much-parallel-access it might work. So this 60TB is then
> 20 4TB disks or so and the 4x 1GB cache is simply not very helpful I
> think. The working set doesn't fit in it I guess. If there is mostly
> single or a few users of the fs, a single pcie based bcacheing 4
> devices can work, but for SATA SSD, I would use 1 SSD per HWraid5.

Yes that's roughly my idea as well and yes the workload is 4 users max
writing data. 50% sequential, 50% random.

> Then roughly make sure the complete set of metadata blocks fits in the
> cache. For an fs of this size let's say/estimate 150G. Then maybe the same
> or double for data, so an SSD of 500G would be a first try.

I would use 1TB devices for each Raid or a 4TB PCIe card.

> You give the impression that reliability for this fs is not the
> highest prio, so if you go full risk, then put bcache in write-back
> mode, then you will have your desired random 8k-16k I/O speedup after
> the cache is warmed up. But any SW or HW failure will result in total
> fs loss normally if SSD and HDD get out of sync somehow. Bcache
> write-through might also be acceptable, you will need extensive
> monitoring and tuning of all (bcache) parameters etc to be sure of the
> right choice of size and setup etc.

Yes, I wanted to use write-back mode. Has anybody already done tests or
gathered experience with a setup like this?

Greets,
Stefan


speed up big btrfs volumes with ssds

2017-09-03 Thread Stefan Priebe - Profihost AG
Hello,

I'm trying to speed up big btrfs volumes.

Some facts:
- Kernel will be 4.13-rc7
- needed volume size is 60TB

Currently without any ssds I get the best speed with:
- 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices

and using btrfs as raid 0 for data and metadata on top of those 4 raid 5.

I can live with a data loss every now and then ;-) so a raid 0 on
top of the 4x raid5 is acceptable for me.

Currently the write speed is not as good as I would like - especially
for random 8k-16k I/O.

My current idea is to use a pcie flash card with bcache on top of each
raid 5.

Is this something that makes sense to speed up writes?
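For reference, a write-back bcache stack on one of the four arrays could be sketched as follows; all device names are placeholders and this is an untested outline, not a recipe:

```
# format the HW RAID5 volume as a bcache backing device,
# and the pcie flash card as a cache device (names are placeholders)
make-bcache -B /dev/sda
make-bcache -C /dev/nvme0n1
# attach the backing device to the cache set (UUID printed by make-bcache -C)
echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach
# switch from the default write-through to write-back caching
echo writeback > /sys/block/bcache0/bcache/cache_mode
# btrfs would then be created on /dev/bcache0 ... /dev/bcache3
# instead of the raw arrays
```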

Greets,
Stefan


Re: slow btrfs with a single kworker process using 100% CPU

2017-08-28 Thread Stefan Priebe - Profihost AG
n_read
<-find_free_extent
   kworker/u24:4-13405 [003]  344186.202598:
block_group_cache_done.isra.27 <-find_free_extent
   kworker/u24:4-13405 [003]  344186.202598: _raw_spin_lock
<-find_free_extent
   kworker/u24:4-13405 [003]  344186.202598:
btrfs_find_space_for_alloc <-find_free_extent
   kworker/u24:4-13405 [003]  344186.202598: _raw_spin_lock
<-btrfs_find_space_for_alloc
   kworker/u24:4-13405 [003]  344186.202599:
tree_search_offset.isra.25 <-btrfs_find_space_for_alloc
   kworker/u24:4-13405 [003]  344186.202623: __get_raid_index
<-find_free_extent
   kworker/u24:4-13405 [003]  344186.202623: up_read <-find_free_extent
   kworker/u24:4-13405 [003]  344186.202623: btrfs_put_block_group
<-find_free_extent
   kworker/u24:4-13405 [003]  344186.202623: _cond_resched
<-find_free_extent

Greets,
Stefan

On 20.08.2017 at 13:00, Stefan Priebe - Profihost AG wrote:
> Hello,
> 
> this still happens with space_cache v2. I don't think it is space_cache
> related?
> 
> Stefan
> 
> On 17.08.2017 at 09:43, Stefan Priebe - Profihost AG wrote:
>> while mounting the device the dmesg is full of:
>> [ 1320.325147]  [] ? kthread_park+0x60/0x60
>> [ 1440.330008] INFO: task btrfs-transacti:3701 blocked for more than 120
>> seconds.
>> [ 1440.330014]   Not tainted 4.4.82+525-ph #1
>> [ 1440.330015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [ 1440.330020] btrfs-transacti D 88080964fdd8 0  3701  2
>> 0x0008
>> [ 1440.330024]  88080964fdd8 a8e10500 880859d4cb00
>> 88080965
>> [ 1440.330026]  881056069800 88080964fe08 88080a10
>> 88080a100068
>> [ 1440.330028]  88080964fdf0 a86d2b75 880036c92000
>> 88080964fe58
>> [ 1440.330028] Call Trace:
>> [ 1440.330053]  [] schedule+0x35/0x80
>> [ 1440.330120]  []
>> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
>> [ 1440.330159]  [] btrfs_commit_transaction+0x3a/0x70
>> [btrfs]
>> [ 1440.330186]  [] transaction_kthread+0x1d5/0x240 [btrfs]
>> [ 1440.330194]  [] kthread+0xeb/0x110
>> [ 1440.330200]  [] ret_from_fork+0x3f/0x70
>> [ 1440.16] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
>>
>> [ 1440.17] Leftover inexact backtrace:
>>
>> [ 1440.22]  [] ? kthread_park+0x60/0x60
>> [ 1560.335839] INFO: task btrfs-transacti:3701 blocked for more than 120
>> seconds.
>> [ 1560.335843]   Not tainted 4.4.82+525-ph #1
>> [ 1560.335843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [ 1560.335848] btrfs-transacti D 88080964fdd8 0  3701  2
>> 0x0008
>> [ 1560.335852]  88080964fdd8 a8e10500 880859d4cb00
>> 88080965
>> [ 1560.335854]  881056069800 88080964fe08 88080a10
>> 88080a100068
>> [ 1560.335856]  88080964fdf0 a86d2b75 880036c92000
>> 88080964fe58
>> [ 1560.335857] Call Trace:
>> [ 1560.335875]  [] schedule+0x35/0x80
>> [ 1560.335953]  []
>> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
>> [ 1560.335978]  [] btrfs_commit_transaction+0x3a/0x70
>> [btrfs]
>> [ 1560.335995]  [] transaction_kthread+0x1d5/0x240 [btrfs]
>> [ 1560.336001]  [] kthread+0xeb/0x110
>> [ 1560.336006]  [] ret_from_fork+0x3f/0x70
>> [ 1560.337829] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
>>
>> [ 1560.337830] Leftover inexact backtrace:
>>
>> [ 1560.337833]  [] ? kthread_park+0x60/0x60
>> [ 1680.341127] INFO: task btrfs-transacti:3701 blocked for more than 120
>> seconds.
>> [ 1680.341130]   Not tainted 4.4.82+525-ph #1
>> [ 1680.341131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [ 1680.341134] btrfs-transacti D 88080964fdd8 0  3701  2
>> 0x0008
>> [ 1680.341137]  88080964fdd8 a8e10500 880859d4cb00
>> 88080965
>> [ 1680.341138]  881056069800 88080964fe08 88080a10
>> 88080a100068
>> [ 1680.341139]  88080964fdf0 a86d2b75 880036c92000
>> 88080964fe58
>> [ 1680.341140] Call Trace:
>> [ 1680.341155]  [] schedule+0x35/0x80
>> [ 1680.341211]  []
>> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
>> [ 1680.341237]  [] btrfs_commit_transaction+0x3a/0x70
>> [btrfs]
>> [ 1680.341252]  [] transaction_kthread+0x1d5/0x240 [btrfs]
>> [ 1680.341258]  [] kthread+0xeb/0x110
>> [ 1680.341262]  [] ret_from_fork+0x3f/0x70
>> [ 1680.343062] DWARF2 unwinder stuck at ret_from_f

Re: slow btrfs with a single kworker process using 100% CPU

2017-08-20 Thread Stefan Priebe - Profihost AG
Hello,

this still happens with space_cache v2. I don't think it is space_cache
related?

Stefan

On 17.08.2017 at 09:43, Stefan Priebe - Profihost AG wrote:
> while mounting the device the dmesg is full of:
> [ 1320.325147]  [] ? kthread_park+0x60/0x60
> [ 1440.330008] INFO: task btrfs-transacti:3701 blocked for more than 120
> seconds.
> [ 1440.330014]   Not tainted 4.4.82+525-ph #1
> [ 1440.330015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 1440.330020] btrfs-transacti D 88080964fdd8 0  3701  2
> 0x0008
> [ 1440.330024]  88080964fdd8 a8e10500 880859d4cb00
> 88080965
> [ 1440.330026]  881056069800 88080964fe08 88080a10
> 88080a100068
> [ 1440.330028]  88080964fdf0 a86d2b75 880036c92000
> 88080964fe58
> [ 1440.330028] Call Trace:
> [ 1440.330053]  [] schedule+0x35/0x80
> [ 1440.330120]  []
> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
> [ 1440.330159]  [] btrfs_commit_transaction+0x3a/0x70
> [btrfs]
> [ 1440.330186]  [] transaction_kthread+0x1d5/0x240 [btrfs]
> [ 1440.330194]  [] kthread+0xeb/0x110
> [ 1440.330200]  [] ret_from_fork+0x3f/0x70
> [ 1440.16] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
> 
> [ 1440.17] Leftover inexact backtrace:
> 
> [ 1440.22]  [] ? kthread_park+0x60/0x60
> [ 1560.335839] INFO: task btrfs-transacti:3701 blocked for more than 120
> seconds.
> [ 1560.335843]   Not tainted 4.4.82+525-ph #1
> [ 1560.335843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 1560.335848] btrfs-transacti D 88080964fdd8 0  3701  2
> 0x0008
> [ 1560.335852]  88080964fdd8 a8e10500 880859d4cb00
> 88080965
> [ 1560.335854]  881056069800 88080964fe08 88080a10
> 88080a100068
> [ 1560.335856]  88080964fdf0 a86d2b75 880036c92000
> 88080964fe58
> [ 1560.335857] Call Trace:
> [ 1560.335875]  [] schedule+0x35/0x80
> [ 1560.335953]  []
> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
> [ 1560.335978]  [] btrfs_commit_transaction+0x3a/0x70
> [btrfs]
> [ 1560.335995]  [] transaction_kthread+0x1d5/0x240 [btrfs]
> [ 1560.336001]  [] kthread+0xeb/0x110
> [ 1560.336006]  [] ret_from_fork+0x3f/0x70
> [ 1560.337829] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
> 
> [ 1560.337830] Leftover inexact backtrace:
> 
> [ 1560.337833]  [] ? kthread_park+0x60/0x60
> [ 1680.341127] INFO: task btrfs-transacti:3701 blocked for more than 120
> seconds.
> [ 1680.341130]   Not tainted 4.4.82+525-ph #1
> [ 1680.341131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 1680.341134] btrfs-transacti D 88080964fdd8 0  3701  2
> 0x0008
> [ 1680.341137]  88080964fdd8 a8e10500 880859d4cb00
> 88080965
> [ 1680.341138]  881056069800 88080964fe08 88080a10
> 88080a100068
> [ 1680.341139]  88080964fdf0 a86d2b75 880036c92000
> 88080964fe58
> [ 1680.341140] Call Trace:
> [ 1680.341155]  [] schedule+0x35/0x80
> [ 1680.341211]  []
> btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
> [ 1680.341237]  [] btrfs_commit_transaction+0x3a/0x70
> [btrfs]
> [ 1680.341252]  [] transaction_kthread+0x1d5/0x240 [btrfs]
> [ 1680.341258]  [] kthread+0xeb/0x110
> [ 1680.341262]  [] ret_from_fork+0x3f/0x70
> [ 1680.343062] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
> 
> Stefan
> 
> On 17.08.2017 at 07:47, Stefan Priebe - Profihost AG wrote:
>> I've backported the free space cache tree to my kernel and hopefully any
>> fixes related to it.
>>
>> The first mount with clear_cache,space_cache=v2 took around 5 hours.
>>
>> Currently I do not see any kworker at 100% CPU, but I don't see much load
>> at all.
>>
>> btrfs-transaction takes around 2-4% CPU together with a kworker process
>> and some 2-3% mdadm processes. I/O Wait is at 3%.
>>
>> That's it. It does not do much more. Writing a file does not work.
>>
>> Greets,
>> Stefan
>>
>> On 16.08.2017 at 14:29, Konstantin V. Gavrilenko wrote:
>>> Roman, initially I had a single process occupying 100% CPU; when sysrq'd it
>>> was showing up as "btrfs_find_space_for_alloc"
>>> but that's when I used the autodefrag, compress, forcecompress and 
>>> commit=10 mount flags and space_cache was v1 by default.
>>> when I switched to "relatime,compress-force=zlib,space_cache=v2" the 100%
>>> CPU disappeared, but the poor performance remained.
>>>
>>>
>>>

Re: slow btrfs with a single kworker process using 100% CPU

2017-08-17 Thread Stefan Priebe - Profihost AG
while mounting the device the dmesg is full of:
[ 1320.325147]  [] ? kthread_park+0x60/0x60
[ 1440.330008] INFO: task btrfs-transacti:3701 blocked for more than 120
seconds.
[ 1440.330014]   Not tainted 4.4.82+525-ph #1
[ 1440.330015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1440.330020] btrfs-transacti D 88080964fdd8 0  3701  2
0x0008
[ 1440.330024]  88080964fdd8 a8e10500 880859d4cb00
88080965
[ 1440.330026]  881056069800 88080964fe08 88080a10
88080a100068
[ 1440.330028]  88080964fdf0 a86d2b75 880036c92000
88080964fe58
[ 1440.330028] Call Trace:
[ 1440.330053]  [] schedule+0x35/0x80
[ 1440.330120]  []
btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
[ 1440.330159]  [] btrfs_commit_transaction+0x3a/0x70
[btrfs]
[ 1440.330186]  [] transaction_kthread+0x1d5/0x240 [btrfs]
[ 1440.330194]  [] kthread+0xeb/0x110
[ 1440.330200]  [] ret_from_fork+0x3f/0x70
[ 1440.16] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70

[ 1440.17] Leftover inexact backtrace:

[ 1440.22]  [] ? kthread_park+0x60/0x60
[ 1560.335839] INFO: task btrfs-transacti:3701 blocked for more than 120
seconds.
[ 1560.335843]   Not tainted 4.4.82+525-ph #1
[ 1560.335843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1560.335848] btrfs-transacti D 88080964fdd8 0  3701  2
0x0008
[ 1560.335852]  88080964fdd8 a8e10500 880859d4cb00
88080965
[ 1560.335854]  881056069800 88080964fe08 88080a10
88080a100068
[ 1560.335856]  88080964fdf0 a86d2b75 880036c92000
88080964fe58
[ 1560.335857] Call Trace:
[ 1560.335875]  [] schedule+0x35/0x80
[ 1560.335953]  []
btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
[ 1560.335978]  [] btrfs_commit_transaction+0x3a/0x70
[btrfs]
[ 1560.335995]  [] transaction_kthread+0x1d5/0x240 [btrfs]
[ 1560.336001]  [] kthread+0xeb/0x110
[ 1560.336006]  [] ret_from_fork+0x3f/0x70
[ 1560.337829] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70

[ 1560.337830] Leftover inexact backtrace:

[ 1560.337833]  [] ? kthread_park+0x60/0x60
[ 1680.341127] INFO: task btrfs-transacti:3701 blocked for more than 120
seconds.
[ 1680.341130]   Not tainted 4.4.82+525-ph #1
[ 1680.341131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.341134] btrfs-transacti D 88080964fdd8 0  3701  2
0x0008
[ 1680.341137]  88080964fdd8 a8e10500 880859d4cb00
88080965
[ 1680.341138]  881056069800 88080964fe08 88080a10
88080a100068
[ 1680.341139]  88080964fdf0 a86d2b75 880036c92000
88080964fe58
[ 1680.341140] Call Trace:
[ 1680.341155]  [] schedule+0x35/0x80
[ 1680.341211]  []
btrfs_commit_transaction.part.24+0x245/0xa30 [btrfs]
[ 1680.341237]  [] btrfs_commit_transaction+0x3a/0x70
[btrfs]
[ 1680.341252]  [] transaction_kthread+0x1d5/0x240 [btrfs]
[ 1680.341258]  [] kthread+0xeb/0x110
[ 1680.341262]  [] ret_from_fork+0x3f/0x70
[ 1680.343062] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70

Stefan

On 17.08.2017 at 07:47, Stefan Priebe - Profihost AG wrote:
> I've backported the free space cache tree to my kernel and hopefully any
> fixes related to it.
> 
> The first mount with clear_cache,space_cache=v2 took around 5 hours.
> 
> Currently I do not see any kworker at 100% CPU, but I don't see much load
> at all.
> 
> btrfs-transaction takes around 2-4% CPU together with a kworker process
> and some 2-3% mdadm processes. I/O Wait is at 3%.
> 
> That's it. It does not do much more. Writing a file does not work.
> 
> Greets,
> Stefan
> 
> Am 16.08.2017 um 14:29 schrieb Konstantin V. Gavrilenko:
>> Roman, initially I had a single process occupying 100% CPU; when I ran sysrq it 
>> was indicated as "btrfs_find_space_for_alloc"
>> but that's when I used the autodefrag, compress, forcecompress and commit=10 
>> mount flags and space_cache was v1 by default.
>> when I switched to "relatime,compress-force=zlib,space_cache=v2" the 100% 
>> cpu has disappeared, but the shite performance remained.
>>
>>
>> As to the chunk size, there is no information in the article about the type 
>> of data that was used. While in our case we are pretty certain about the 
>> compressed block size (32-128). I am currently inclining towards 32k as it 
>> might be ideal in a situation when we have a 5 disk raid5 array.
>>
>> In theory
>> 1. The minimum compressed write (32k) would fill the chunk on a single disk, 
>> thus the IO cost of the operation would be 2 reads (original chunk + 
>> original parity) and 2 writes (new chunk + new parity)
>>
>> 2. The maximum compressed write (128k) would require the update of 1 chunk 
>> on each of the 4 data disks + 1 parity write

Re: slow btrfs with a single kworker process using 100% CPU

2017-08-16 Thread Stefan Priebe - Profihost AG
i've backported the free space cache tree to my kernel and hopefully any
fixes related to it.

The first mount with clear_cache,space_cache=v2 took around 5 hours.

Currently i do not see any kworker with 100% CPU but i don't see much load
at all.

btrfs-transaction takes around 2-4% CPU together with a kworker process
and some 2-3% mdadm processes. I/O Wait is at 3%.

That's it. It does not do much more. Writing a file does not work.

Greets,
Stefan

Am 16.08.2017 um 14:29 schrieb Konstantin V. Gavrilenko:
> Roman, initially I had a single process occupying 100% CPU; when I ran sysrq it was 
> indicated as "btrfs_find_space_for_alloc"
> but that's when I used the autodefrag, compress, forcecompress and commit=10 
> mount flags and space_cache was v1 by default.
> when I switched to "relatime,compress-force=zlib,space_cache=v2" the 100% cpu 
> has disappeared, but the shite performance remained.
> 
> 
> As to the chunk size, there is no information in the article about the type 
> of data that was used. While in our case we are pretty certain about the 
> compressed block size (32-128). I am currently inclining towards 32k as it 
> might be ideal in a situation when we have a 5 disk raid5 array.
> 
> In theory
> 1. The minimum compressed write (32k) would fill the chunk on a single disk, 
> thus the IO cost of the operation would be 2 reads (original chunk + original 
> parity)  and 2 writes (new chunk + new parity)
> 
> 2. The maximum compressed write (128k) would require the update of 1 chunk on 
> each of the 4 data disks + 1 parity  write 
> 
> 
> 
> Stefan what mount flags do you use?
> 
> kos
> 
> 
> 
> - Original Message -
> From: "Roman Mamedov" <r...@romanrm.net>
> To: "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com>
> Cc: "Stefan Priebe - Profihost AG" <s.pri...@profihost.ag>, "Marat Khalili" 
> <m...@rqc.ru>, linux-btrfs@vger.kernel.org, "Peter Grandi" 
> <p...@btrfs.list.sabi.co.uk>
> Sent: Wednesday, 16 August, 2017 2:00:03 PM
> Subject: Re: slow btrfs with a single kworker process using 100% CPU
> 
> On Wed, 16 Aug 2017 12:48:42 +0100 (BST)
> "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com> wrote:
> 
>> I believe the chunk size of 512kb is even worse for performance than the 
>> default settings on my HW RAID of  256kb.
> 
> It might be, but that does not explain the original problem reported at all.
> If mdraid performance were the bottleneck, you would see high iowait,
> possibly some CPU load from the mdX_raidY threads. But not a single Btrfs
> thread pegging into 100% CPU.
> 
>> So now I am moving the data from the array and will be rebuilding it with 64
>> or 32 chunk size and checking the performance.
> 
> 64K is the sweet spot for RAID5/6:
> http://louwrentius.com/linux-raid-level-and-chunk-size-the-benchmarks.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: slow btrfs with a single kworker process using 100% CPU

2017-08-16 Thread Stefan Priebe - Profihost AG

Am 16.08.2017 um 14:29 schrieb Konstantin V. Gavrilenko:
> Roman, initially I had a single process occupying 100% CPU; when I ran sysrq it was 
> indicated as "btrfs_find_space_for_alloc"
> but that's when I used the autodefrag, compress, forcecompress and commit=10 
> mount flags and space_cache was v1 by default.
> when I switched to "relatime,compress-force=zlib,space_cache=v2" the 100% cpu 
> has disappeared, but the shite performance remained.

space_cache=v2 is not supported by the openSUSE kernel - but i
compile the kernel myself anyway. Is there a patchset to add support for
space_cache=v2?

Greets,
Stefan
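As an aside on checking which options actually took effect: the live option string for a mounted btrfs appears in the fourth field of /proc/mounts and can be picked apart mechanically. A minimal sketch (`parse_mount_opts` is my own helper, not a btrfs tool):

```python
def parse_mount_opts(opts: str) -> dict:
    """Split a mount option string (4th field of /proc/mounts) into a
    dict; valueless flags such as 'noatime' map to True."""
    parsed = {}
    for token in opts.split(","):
        key, _, value = token.partition("=")
        parsed[key] = value if value else True
    return parsed

opts = parse_mount_opts("relatime,compress-force=zlib,space_cache=v2")
print(opts["space_cache"])      # v2
print(opts["compress-force"])   # zlib
```

`grep btrfs /proc/mounts` shows the string to feed in; space_cache without "=v2" means the old v1 cache is active.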

> 
> As to the chunk size, there is no information in the article about the type 
> of data that was used. While in our case we are pretty certain about the 
> compressed block size (32-128). I am currently inclining towards 32k as it 
> might be ideal in a situation when we have a 5 disk raid5 array.
> 
> In theory
> 1. The minimum compressed write (32k) would fill the chunk on a single disk, 
> thus the IO cost of the operation would be 2 reads (original chunk + original 
> parity)  and 2 writes (new chunk + new parity)
> 
> 2. The maximum compressed write (128k) would require the update of 1 chunk on 
> each of the 4 data disks + 1 parity  write 
> 
> 
> 
> Stefan what mount flags do you use?
> 
> kos
> 
> 
> 
> - Original Message -
> From: "Roman Mamedov" <r...@romanrm.net>
> To: "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com>
> Cc: "Stefan Priebe - Profihost AG" <s.pri...@profihost.ag>, "Marat Khalili" 
> <m...@rqc.ru>, linux-btrfs@vger.kernel.org, "Peter Grandi" 
> <p...@btrfs.list.sabi.co.uk>
> Sent: Wednesday, 16 August, 2017 2:00:03 PM
> Subject: Re: slow btrfs with a single kworker process using 100% CPU
> 
> On Wed, 16 Aug 2017 12:48:42 +0100 (BST)
> "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com> wrote:
> 
>> I believe the chunk size of 512kb is even worse for performance than the 
>> default settings on my HW RAID of  256kb.
> 
> It might be, but that does not explain the original problem reported at all.
> If mdraid performance were the bottleneck, you would see high iowait,
> possibly some CPU load from the mdX_raidY threads. But not a single Btrfs
> thread pegging into 100% CPU.
> 
>> So now I am moving the data from the array and will be rebuilding it with 64
>> or 32 chunk size and checking the performance.
> 
> 64K is the sweet spot for RAID5/6:
> http://louwrentius.com/linux-raid-level-and-chunk-size-the-benchmarks.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: slow btrfs with a single kworker process using 100% CPU

2017-08-16 Thread Stefan Priebe - Profihost AG

Am 16.08.2017 um 14:29 schrieb Konstantin V. Gavrilenko:
> Roman, initially I had a single process occupying 100% CPU; when I ran sysrq it was 
> indicated as "btrfs_find_space_for_alloc"
> but that's when I used the autodefrag, compress, forcecompress and commit=10 
> mount flags and space_cache was v1 by default.
> when I switched to "relatime,compress-force=zlib,space_cache=v2" the 100% cpu 
> has disappeared, but the shite performance remained.
> 
> 
> As to the chunk size, there is no information in the article about the type 
> of data that was used. While in our case we are pretty certain about the 
> compressed block size (32-128). I am currently inclining towards 32k as it 
> might be ideal in a situation when we have a 5 disk raid5 array.
> 
> In theory
> 1. The minimum compressed write (32k) would fill the chunk on a single disk, 
> thus the IO cost of the operation would be 2 reads (original chunk + original 
> parity)  and 2 writes (new chunk + new parity)
> 
> 2. The maximum compressed write (128k) would require the update of 1 chunk on 
> each of the 4 data disks + 1 parity  write 
> 
> 
> 
> Stefan what mount flags do you use?

noatime,compress-force=zlib,noacl,space_cache,skip_balance,subvolid=5,subvol=/

Greets,
Stefan


> kos
> 
> 
> 
> - Original Message -
> From: "Roman Mamedov" <r...@romanrm.net>
> To: "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com>
> Cc: "Stefan Priebe - Profihost AG" <s.pri...@profihost.ag>, "Marat Khalili" 
> <m...@rqc.ru>, linux-btrfs@vger.kernel.org, "Peter Grandi" 
> <p...@btrfs.list.sabi.co.uk>
> Sent: Wednesday, 16 August, 2017 2:00:03 PM
> Subject: Re: slow btrfs with a single kworker process using 100% CPU
> 
> On Wed, 16 Aug 2017 12:48:42 +0100 (BST)
> "Konstantin V. Gavrilenko" <k.gavrile...@arhont.com> wrote:
> 
>> I believe the chunk size of 512kb is even worse for performance than the 
>> default settings on my HW RAID of  256kb.
> 
> It might be, but that does not explain the original problem reported at all.
> If mdraid performance were the bottleneck, you would see high iowait,
> possibly some CPU load from the mdX_raidY threads. But not a single Btrfs
> thread pegging into 100% CPU.
> 
>> So now I am moving the data from the array and will be rebuilding it with 64
>> or 32 chunk size and checking the performance.
> 
> 64K is the sweet spot for RAID5/6:
> http://louwrentius.com/linux-raid-level-and-chunk-size-the-benchmarks.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: slow btrfs with a single kworker process using 100% CPU

2017-08-16 Thread Stefan Priebe - Profihost AG
Am 16.08.2017 um 11:02 schrieb Konstantin V. Gavrilenko:
> Could be a similar issue to what I had recently, with RAID5 and a 256kb chunk 
> size.
> please provide more information about your RAID setup.

Hope this helps:

# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md0 : active raid5 sdd1[1] sdf1[4] sdc1[0] sde1[2]
      11717406720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] []
      bitmap: 6/30 pages [24KB], 65536KB chunk

md2 : active raid5 sdm1[2] sdl1[1] sdk1[0] sdn1[4]
      11717406720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] []
      bitmap: 7/30 pages [28KB], 65536KB chunk

md1 : active raid5 sdi1[2] sdg1[0] sdj1[4] sdh1[1]
      11717406720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] []
      bitmap: 7/30 pages [28KB], 65536KB chunk

md3 : active raid5 sdp1[1] sdo1[0] sdq1[2] sdr1[4]
      11717406720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] []
      bitmap: 6/30 pages [24KB], 65536KB chunk

# btrfs fi usage /vmbackup/
Overall:
Device size:  43.65TiB
Device allocated: 31.98TiB
Device unallocated:   11.67TiB
Device missing:  0.00B
Used: 30.80TiB
Free (estimated): 12.84TiB  (min: 12.84TiB)
Data ratio:   1.00
Metadata ratio:   1.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,RAID0: Size:31.83TiB, Used:30.66TiB
   /dev/md0    7.96TiB
   /dev/md1    7.96TiB
   /dev/md2    7.96TiB
   /dev/md3    7.96TiB

Metadata,RAID0: Size:153.00GiB, Used:141.34GiB
   /dev/md0   38.25GiB
   /dev/md1   38.25GiB
   /dev/md2   38.25GiB
   /dev/md3   38.25GiB

System,RAID0: Size:128.00MiB, Used:2.28MiB
   /dev/md0   32.00MiB
   /dev/md1   32.00MiB
   /dev/md2   32.00MiB
   /dev/md3   32.00MiB

Unallocated:
   /dev/md0    2.92TiB
   /dev/md1    2.92TiB
   /dev/md2    2.92TiB
   /dev/md3    2.92TiB


Stefan
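For the geometry shown above (4-device md RAID5 members, 512k chunk), the full-stripe width, i.e. the smallest aligned write that avoids read-modify-write, works out as follows. A small sketch (`raid5_geometry` is my own illustrative helper, not an md or btrfs API):

```python
def raid5_geometry(disks, chunk_kib):
    """Data-disk count and full-stripe width (KiB) of an md RAID5."""
    data_disks = disks - 1
    return data_disks, data_disks * chunk_kib

# The four md arrays above: 4 devices each, 512k chunk
data, stripe = raid5_geometry(4, 512)
print(data, stripe)   # 3 1536 -> writes under 1536 KiB are read-modify-write
```

With compressed extents capped at 128k, every btrfs data write lands far below that 1536 KiB full stripe, so each one pays the read-modify-write penalty.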

> 
> p.s.
> you can also check the thread "Btrfs + compression = slow performance and high 
> cpu usage"
> 
> - Original Message -
> From: "Stefan Priebe - Profihost AG" <s.pri...@profihost.ag>
> To: "Marat Khalili" <m...@rqc.ru>, linux-btrfs@vger.kernel.org
> Sent: Wednesday, 16 August, 2017 10:37:43 AM
> Subject: Re: slow btrfs with a single kworker process using 100% CPU
> 
> Am 16.08.2017 um 08:53 schrieb Marat Khalili:
>>> I've one system where a single kworker process is using 100% CPU
>>> sometimes a second process comes up with 100% CPU [btrfs-transacti]. Is
>>> there anything i can do to get the old speed again or find the culprit?
>>
>> 1. Do you use quotas (qgroups)?
> 
> No qgroups and no quota.
> 
>> 2. Do you have a lot of snapshots? Have you deleted some recently?
> 
> 1413 Snapshots. I'm deleting 50 of them every night. But btrfs-cleaner
> process isn't running / consuming CPU currently.
> 
>> More info about your system would help too.
> Kernel is OpenSuSE Leap 42.3.
> 
> btrfs is mounted with
> compress-force=zlib
> 
> btrfs is running as a raid0 on top of 4 md raid 5 devices.
> 
> Greets,
> Stefan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: slow btrfs with a single kworker process using 100% CPU

2017-08-16 Thread Stefan Priebe - Profihost AG
Am 16.08.2017 um 08:53 schrieb Marat Khalili:
>> I've one system where a single kworker process is using 100% CPU
>> sometimes a second process comes up with 100% CPU [btrfs-transacti]. Is
>> there anything i can do to get the old speed again or find the culprit?
> 
> 1. Do you use quotas (qgroups)?

No qgroups and no quota.

> 2. Do you have a lot of snapshots? Have you deleted some recently?

1413 Snapshots. I'm deleting 50 of them every night. But btrfs-cleaner
process isn't running / consuming CPU currently.

> More info about your system would help too.
Kernel is OpenSuSE Leap 42.3.

btrfs is mounted with
compress-force=zlib

btrfs is running as a raid0 on top of 4 md raid 5 devices.

Greets,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


slow btrfs with a single kworker process using 100% CPU

2017-08-16 Thread Stefan Priebe - Profihost AG
Hello,

I've one system where a single kworker process is using 100% CPU
sometimes a second process comes up with 100% CPU [btrfs-transacti]. Is
there anything i can do to get the old speed again or find the culprit?

Greets,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: runtime btrfsck

2017-05-10 Thread Stefan Priebe - Profihost AG
Hello,

thanks. But is there any way to recover from this error? Like removing
the item or so? Data loss isn't a problem. Just reconstructing the whole
FS will take quite a long time.

Stefan

Am 10.05.2017 um 11:54 schrieb Hugo Mills:
> On Wed, May 10, 2017 at 11:20:58AM +0200, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> here's the output:
>> # for block in 163316514816 163322413056 163325722624; do echo $block;
>> btrfs-debug-tree -b $block /dev/mapper/crypt_md0|sed -re 's/(\t| )name:
>> .*/\1name: HIDDEN/'; done
>>
>> 163316514816
>> btrfs-progs v4.8.5
>> leaf 163316514816 items 188 free space 1387 generation 86739 owner 3892
>> fs uuid 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
>> chunk uuid b86efe94-ab40-4344-ac6b-46ec59c41b8f
> [...]
>> item 37 key (23760 DIR_INDEX 36) itemoff 14278 itemsize 58
>> location key (28124232 INODE_ITEM 0) type FILE
>> transid 86739 data_len 0 name_len 28
>> name: HIDDEN
>> item 38 key (23760 DIR_INDEX 37) itemoff 14220 itemsize 58
>> location key (28124233 INODE_ITEM 0) type FILE
>> transid 86739 data_len 0 name_len 28
>> name: HIDDEN
>> item 39 key (23760 DIR_INDEX 38) itemoff 14165 itemsize 55
>> location key (28124234 INODE_ITEM 0) type FILE
>> transid 86739 data_len 0 name_len 25
>> name: HIDDEN
>> item 40 key (23760 DIR_INDEX 22) itemoff 14115 itemsize 50
>> location key (26923383 INODE_ITEM 0) type FILE
>> transid 74009 data_len 0 name_len 20
>> name: HIDDEN
>> item 41 key (23760 DIR_INDEX 23) itemoff 14067 itemsize 48
>> location key (26923384 INODE_ITEM 0) type FILE
>> transid 74009 data_len 0 name_len 18
>> name: HIDDEN
> [...]
>> 163322413056
>> btrfs-progs v4.8.5
>> leaf 163322413056 items 113 free space 934 generation 86739 owner 3892
>> fs uuid 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
>> chunk uuid b86efe94-ab40-4344-ac6b-46ec59c41b8f
> [...]
>> item 73 key (24016 DIR_INDEX 19) itemoff 9651 itemsize 62
>> location key (28124251 INODE_ITEM 0) type FILE
>> transid 86739 data_len 0 name_len 32
>> name: HIDDEN
>> item 74 key (24016 DIR_INDEX 20) itemoff 9592 itemsize 59
>> location key (28124252 INODE_ITEM 0) type FILE
>> transid 86739 data_len 0 name_len 29
>> name: HIDDEN
>> item 75 key (24016 DIR_INDEX 4) itemoff 9538 itemsize 54
>> location key (26923401 INODE_ITEM 0) type FILE
>> transid 74009 data_len 0 name_len 24
>> name: HIDDEN
>> item 76 key (24016 DIR_INDEX 5) itemoff 9486 itemsize 52
>> location key (26923402 INODE_ITEM 0) type FILE
>> transid 74009 data_len 0 name_len 22
>> name: HIDDEN
> [...]
>> 163325722624
>> btrfs-progs v4.8.5
>> leaf 163325722624 items 78 free space 6563 generation 86739 owner 3892
>> fs uuid 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
>> chunk uuid b86efe94-ab40-4344-ac6b-46ec59c41b8f
> [...]
>> item 62 key (24189 DIR_INDEX 16) itemoff 9409 itemsize 64
>> location key (28124267 INODE_ITEM 0) type FILE
>> transid 86739 data_len 0 name_len 34
>> name: HIDDEN
>> item 63 key (24189 DIR_INDEX 17) itemoff 9349 itemsize 60
>> location key (28124268 INODE_ITEM 0) type FILE
>> transid 86739 data_len 0 name_len 30
>> name: HIDDEN
>> item 64 key (24189 DIR_INDEX 4) itemoff 9296 itemsize 53
>> location key (26923421 INODE_ITEM 0) type FILE
>> transid 74010 data_len 0 name_len 23
>> name: HIDDEN
>> item 65 key (24189 DIR_INDEX 5) itemoff 9236 itemsize 60
>> location key (26923422 INODE_ITEM 0) type FILE
>> transid 74010 data_len 0 name_len 30
>> name: HIDDEN
> [...]
> 
>    In each case, the tree node keys have simply been sorted
> incorrectly -- the ordering is otherwise correct, but jumps backwards
> at some point in the sequence. At least in the first instance, some of
> the keys appear to have been duplicated: there are two (23760
> DIR_INDEX 22) keys in the list. (I didn't check in detail with the
> other two whether there are duplicates, but I would
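The ordering rule behind the "bad key ordering" error can be checked mechanically: btrfs sorts leaf items by their (objectid, type, offset) key tuple, and the check trips at the first slot whose key is not strictly greater than its predecessor. A small sketch (`keys_sorted` is my own helper; using 96 as the DIR_INDEX type value is an assumption for illustration):

```python
DIR_INDEX = 96  # assumed numeric value of the DIR_INDEX key type

def keys_sorted(keys):
    """Return the first slot whose (objectid, type, offset) key is not
    strictly greater than its predecessor, or None if ordering holds."""
    for i in range(1, len(keys)):
        if keys[i] <= keys[i - 1]:
            return i
    return None

# Mimicking the dump above: DIR_INDEX 36..38, then a backwards jump to 22
keys = [(23760, DIR_INDEX, 36), (23760, DIR_INDEX, 37),
        (23760, DIR_INDEX, 38), (23760, DIR_INDEX, 22),
        (23760, DIR_INDEX, 23)]
print(keys_sorted(keys))   # 3 -> the slot the check would complain about
```

A duplicate key fails the same strict "greater than" test, which matches the duplicated (23760 DIR_INDEX 22) entries noted above.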

Re: runtime btrfsck

2017-05-10 Thread Stefan Priebe - Profihost AG
Hello,

here's the output:
# for block in 163316514816 163322413056 163325722624; do echo $block;
btrfs-debug-tree -b $block /dev/mapper/crypt_md0|sed -re 's/(\t| )name:
.*/\1name: HIDDEN/'; done

163316514816
btrfs-progs v4.8.5
leaf 163316514816 items 188 free space 1387 generation 86739 owner 3892
fs uuid 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
chunk uuid b86efe94-ab40-4344-ac6b-46ec59c41b8f
item 0 key (23760 DIR_ITEM 2479948887) itemoff 16229 itemsize 54
location key (26923382 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 24
name: HIDDEN
item 1 key (23760 DIR_ITEM 2652742785) itemoff 16170 itemsize 59
location key (28124230 INODE_ITEM 0) type FILE
transid 86739 data_len 0 name_len 29
name: HIDDEN
item 2 key (23760 DIR_ITEM 2688971413) itemoff 16119 itemsize 51
location key (26923386 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 21
name: HIDDEN
item 3 key (23760 DIR_ITEM 2764880658) itemoff 16064 itemsize 55
location key (26923399 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 25
name: HIDDEN
item 4 key (23760 DIR_ITEM 2805527189) itemoff 16006 itemsize 58
location key (28124233 INODE_ITEM 0) type FILE
transid 86739 data_len 0 name_len 28
name: HIDDEN
item 5 key (23760 DIR_ITEM 2876464375) itemoff 15957 itemsize 49
location key (26923393 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 19
name: HIDDEN
item 6 key (23760 DIR_ITEM 2951059296) itemoff 15907 itemsize 50
location key (28124218 INODE_ITEM 0) type FILE
transid 86739 data_len 0 name_len 20
name: HIDDEN
item 7 key (23760 DIR_ITEM 3058144963) itemoff 15859 itemsize 48
location key (26923384 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 18
name: HIDDEN
item 8 key (23760 DIR_ITEM 3095440808) itemoff 15804 itemsize 55
location key (26923394 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 25
name: HIDDEN
item 9 key (23760 DIR_ITEM 3124573416) itemoff 15748 itemsize 56
location key (26923387 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 26
name: HIDDEN
item 10 key (23760 DIR_ITEM 3194204932) itemoff 15690 itemsize 58
location key (26923397 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 28
name: HIDDEN
item 11 key (23760 DIR_ITEM 3281114395) itemoff 15637 itemsize 53
location key (26923388 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 23
name: HIDDEN
item 12 key (23760 DIR_ITEM 3353597736) itemoff 15588 itemsize 49
location key (24944 INODE_ITEM 0) type FILE
transid 10694 data_len 0 name_len 19
name: HIDDEN
item 13 key (23760 DIR_ITEM 3389003195) itemoff 15539 itemsize 49
location key (28124226 INODE_ITEM 0) type FILE
transid 86739 data_len 0 name_len 19
name: HIDDEN
item 14 key (23760 DIR_ITEM 3461310858) itemoff 15473 itemsize 66
location key (26923392 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 36
name: HIDDEN
item 15 key (23760 DIR_ITEM 3660173809) itemoff 15422 itemsize 51
location key (28124225 INODE_ITEM 0) type FILE
transid 86739 data_len 0 name_len 21
name: HIDDEN
item 16 key (23760 DIR_ITEM 3678308711) itemoff 15371 itemsize 51
location key (28124220 INODE_ITEM 0) type FILE
transid 86739 data_len 0 name_len 21
name: HIDDEN
item 17 key (23760 DIR_ITEM 3708519009) itemoff 15316 itemsize 55
location key (28124224 INODE_ITEM 0) type FILE
transid 86739 data_len 0 name_len 25
name: HIDDEN
item 18 key (23760 DIR_ITEM 3716314603) itemoff 15258 itemsize 58
location key (26923396 INODE_ITEM 0) type FILE
transid 74009 data_len 0 name_len 28
name: HIDDEN
item 19 key (23760 DIR_ITEM 3958443109) itemoff 15224 itemsize 34
location key (24016 INODE_ITEM 0) type DIR
transid 10693 data_len 0 name_len 4
name: HIDDEN
item 20 key (23760 DIR_INDEX 2) itemoff 15190 itemsize 34
location key (24016 INODE_ITEM 0) type DIR
transid 10693 data_len 0 name_len 4
name: HIDDEN
item 21 key (23760 DIR_INDEX 

Re: runtime btrfsck

2017-05-10 Thread Stefan Priebe - Profihost AG
Hi,
Am 10.05.2017 um 09:48 schrieb Martin Steigerwald:
> Stefan Priebe - Profihost AG - 10.05.17, 09:02:
>> I'm now trying btrfs progs 4.10.2. Is anybody out there who can tell me
>> something about the expected runtime or how to fix bad key ordering?
> 
> I had a similar issue which remained unresolved.
> But I clearly saw that btrfs check was running in a loop, see thread:
> [4.9] btrfs check --repair looping over file extent discount errors
> 
> So it would be interesting to see the exact output of btrfs check, maybe 
> there 
> is something like repeated numbers that also indicate a loop.

Output is just:
enabling repair mode
Checking filesystem on /dev/mapper/crypt_md0
UUID: 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
bad key ordering 39 40
checking extents [.]

even after 2,5 weeks running.

Stefan

> I was about to say that BTRFS is production ready before this issue happened. 
> I still think for a lot of setups it mostly is, as at least the "I get stuck 
> on 
> the CPU while searching for free space" issue seems to be gone since about 
> anything between 4.5/4.6 kernels. I also think so regarding absence of data 
> loss. I was able to copy over all of the data I needed of the broken 
> filesystem.
> 
> Yet, when it comes to btrfs check? Its still quite rudimentary if you ask me. 
>  
> So unless someone has a clever idea here and shares it with you, it may be 
> needed to backup anything you can from this filesystem and then start over 
> from 
> scratch. As to my past experience something like xfs_repair surpasses btrfs 
> check in the ability to actually fix broken filesystem by a great extent.
> 
> Ciao,
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: runtime btrfsck

2017-05-10 Thread Stefan Priebe - Profihost AG
Am 10.05.2017 um 09:40 schrieb Hugo Mills:
> On Wed, May 10, 2017 at 09:36:30AM +0200, Stefan Priebe - Profihost AG wrote:
>> Hello Roman,
>>
>> the FS is mountable. It just goes readonly when trying to write some data.
>>
>> The kernel msgs are:
>> BTRFS critical (device dm-2): corrupt leaf, bad key order:
>> block=163316514816,root=1, slot=39
>> BTRFS critical (device dm-2): corrupt leaf, bad key order:
>> block=163322413056,root=1, slot=74
>> BTRFS critical (device dm-2): corrupt leaf, bad key order:
>> block=163325722624,root=1, slot=63
>> BTRFS critical (device dm-2): corrupt leaf, bad key order:
>> block=163316514816,root=1, slot=39
>> BTRFS: error (device dm-2) in btrfs_drop_snapshot:8839: errno=-5 IO failure
>> BTRFS info (device dm-2): forced readonly
>> BTRFS info (device dm-2): delayed_refs has NO entry
> 
>    Can you show the output of btrfs-debug-tree -b <block>, where
> <block> is each of the three "block=" values above?

Can do that. But the lists are very long - should i upload them to
pastebin? Is it ok to remove the name attribute - which provides filenames?

Stefan


>Hugo.
> 
>> Greets,
>> Stefan
>> Am 10.05.2017 um 09:18 schrieb Roman Mamedov:
>>> On Wed, 10 May 2017 09:02:46 +0200
>>> Stefan Priebe - Profihost AG <s.pri...@profihost.ag> wrote:
>>>
>>>> how to fix bad key ordering?
>>>
>>> You should clarify does the FS in question mount (read-write? read-only?)
>>> and what are the kernel messages if it does not.
>>>
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: runtime btrfsck

2017-05-10 Thread Stefan Priebe - Profihost AG
Hello Roman,

the FS is mountable. It just goes readonly when trying to write some data.

The kernel msgs are:
BTRFS critical (device dm-2): corrupt leaf, bad key order:
block=163316514816,root=1, slot=39
BTRFS critical (device dm-2): corrupt leaf, bad key order:
block=163322413056,root=1, slot=74
BTRFS critical (device dm-2): corrupt leaf, bad key order:
block=163325722624,root=1, slot=63
BTRFS critical (device dm-2): corrupt leaf, bad key order:
block=163316514816,root=1, slot=39
BTRFS: error (device dm-2) in btrfs_drop_snapshot:8839: errno=-5 IO failure
BTRFS info (device dm-2): forced readonly
BTRFS info (device dm-2): delayed_refs has NO entry

Greets,
Stefan
Am 10.05.2017 um 09:18 schrieb Roman Mamedov:
> On Wed, 10 May 2017 09:02:46 +0200
> Stefan Priebe - Profihost AG <s.pri...@profihost.ag> wrote:
> 
>> how to fix bad key ordering?
> 
> You should clarify does the FS in question mount (read-write? read-only?)
> and what are the kernel messages if it does not.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: runtime btrfsck

2017-05-10 Thread Stefan Priebe - Profihost AG
I'm now trying btrfs progs 4.10.2. Is anybody out there who can tell me
something about the expected runtime or how to fix bad key ordering?

Greets,
Stefan

Am 06.05.2017 um 07:56 schrieb Stefan Priebe - Profihost AG:
> It's still running. Is this the normal behaviour? Is there any other way
> to fix the bad key ordering?
> 
> Greets,
> Stefan
> 
> Am 02.05.2017 um 08:29 schrieb Stefan Priebe - Profihost AG:
>> Hello list,
>>
>> i wanted to check an fs cause it has bad key ordering.
>>
>> But btrfscheck is now running since 7 days. Current output:
>> # btrfsck -p --repair /dev/mapper/crypt_md0
>> enabling repair mode
>> Checking filesystem on /dev/mapper/crypt_md0
>> UUID: 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
>> bad key ordering 39 40
>> checking extents [O]
>>
>> FS is a 12TB BTRFS Raid 0 over 3 mdadm Raid 5 devices. How long should
>> btrfsck run and is there any way to speed it up? btrfs tools is 4.8.5
>>
>> Thanks!
>>
>> Greets,
>> Stefan
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: runtime btrfsck

2017-05-05 Thread Stefan Priebe - Profihost AG
It's still running. Is this the normal behaviour? Is there any other way
to fix the bad key ordering?

Greets,
Stefan

Am 02.05.2017 um 08:29 schrieb Stefan Priebe - Profihost AG:
> Hello list,
> 
> i wanted to check an fs cause it has bad key ordering.
> 
> But btrfscheck is now running since 7 days. Current output:
> # btrfsck -p --repair /dev/mapper/crypt_md0
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_md0
> UUID: 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
> bad key ordering 39 40
> checking extents [O]
> 
> The FS is a 12TB btrfs RAID 0 over 3 mdadm RAID 5 devices. How long should
> btrfsck run, and is there any way to speed it up? btrfs-progs is 4.8.5.
> 
> Thanks!
> 
> Greets,
> Stefan
> 


runtime btrfsck

2017-05-02 Thread Stefan Priebe - Profihost AG
Hello list,

I wanted to check an fs because it has bad key ordering.

But btrfsck has now been running for 7 days. Current output:
# btrfsck -p --repair /dev/mapper/crypt_md0
enabling repair mode
Checking filesystem on /dev/mapper/crypt_md0
UUID: 37b15dd8-b2e1-4585-98d0-cc0fa2a5a7c9
bad key ordering 39 40
checking extents [O]

The FS is a 12TB btrfs RAID 0 over 3 mdadm RAID 5 devices. How long should
btrfsck run, and is there any way to speed it up? btrfs-progs is 4.8.5.

Thanks!

Greets,
Stefan


Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues

2017-04-25 Thread Stefan Priebe - Profihost AG
Hello Qu,

Still no one on this one? Or is this solved in another way in 4.10 or
4.11, or is compression just experimental? I haven't seen a note on this.

Thanks,
Stefan

On 27.02.2017 at 14:43, Stefan Priebe - Profihost AG wrote:
> Hi,
> 
> Can anybody please comment on this one? Josef? Chris? I still need these
> patches to be able to let btrfs run for more than 24 hours without ENOSPC
> issues.
> 
> Greets,
> Stefan
> 
> On 27.02.2017 at 08:22, Qu Wenruo wrote:
>>
>>
>> At 02/25/2017 04:23 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear Qu,
>>>
>>> any news on your branch? I still don't see it merged anywhere.
>>>
>>> Greets,
>>> Stefan
>>
>> I just remember that Liu Bo has commented on one of the patches; I'm afraid
>> I can only push these patches once I have addressed his concern.
>>
>> I'll start digging into it, as my memory of this fix is quite blurred now.
>>
>> Thanks,
>> Qu
>>>
>>> On 04.01.2017 at 17:13, Stefan Priebe - Profihost AG wrote:
>>>> Hi Qu,
>>>>
>>>> On 01.01.2017 at 10:32, Qu Wenruo wrote:
>>>>> Hi Stefan,
>>>>>
>>>>> I'm trying to push it to for-next (will be v4.11), but no response yet.
>>>>>
>>>>> It would be quite nice for you to test the following git pull and give
>>>>> some feedback, so that we can merge it faster.
>>>>>
>>>>> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg60418.html
>>>>
>>>> I'm also getting a notification that Wang's email address does not exist
>>>> anymore (wangxg.f...@cn.fujitsu.com).
>>>>
>>>> I would like to test that branch but will need some time to do so. Last
>>>> time I tried 4.10-rc1 I had the same problems as this guy:
>>>> https://www.marc.info/?l=linux-btrfs&m=148338312525137&w=2
>>>>
>>>> Stefan
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>> On 12/31/2016 03:31 PM, Stefan Priebe - Profihost AG wrote:
>>>>>> Any news on this series? I can't see it in 4.9 or in 4.10-rc.
>>>>>>
>>>>>> Stefan
>>>>>>
>>>>>> On 11.11.2016 at 09:39, Wang Xiaoguang wrote:
>>>>>>> When having compression enabled, Stefan Priebe often got enospc errors
>>>>>>> though fs still has much free space. Qu Wenruo also has submitted a
>>>>>>> fstests test case which can reproduce this bug steadily, please see
>>>>>>> url: https://patchwork.kernel.org/patch/9420527
>>>>>>>
>>>>>>> First patch[1/3] "btrfs: improve inode's outstanding_extents
>>>>>>> computation" is to
>>>>>>> fix outstanding_extents and reserved_extents account issues. This
>>>>>>> issue was revealed
>>>>>>> by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, When modifying
>>>>>>> BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often gets these
>>>>>>> warnings from
>>>>>>> btrfs_destroy_inode():
>>>>>>> WARN_ON(BTRFS_I(inode)->outstanding_extents);
>>>>>>> WARN_ON(BTRFS_I(inode)->reserved_extents);
>>>>>>> Please see this patch's commit message for detailed info, and this
>>>>>>> patch is
>>>>>>> necessary to patch2 and patch3.
>>>>>>>
>>>>>>> For false enospc, the root reason is that for compression, its max
>>>>>>> extent size will
>>>>>>> be 128k, not 128MB. If we still use 128MB as max extent size to
>>>>>>> reserve metadata for
>>>>>>> compression, obviously it's not appropriate. In patch "btrfs:
>>>>>>> Introduce COMPRESS
>>>>>>> reserve type to fix false enospc for compression" commit message,
>>>>>>> we explain why false enospc error occurs, please see it for detailed
>>>>>>> info.
>>>>>>>
>>>>>>> To fix this issue, we introduce a new enum type:
>>>>>>> enum btrfs_metadata_reserve_type {
>>>>>>> BTRFS_RESERVE_NORMAL,
>>>>>>> BTRFS_RESERVE_COMPRESS,
>>>>>>> };
>>>>>>> For btrfs_delalloc_[reserve|release]_metadata() and
>>>>>>

Re: [PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error

2017-03-14 Thread Stefan Priebe - Profihost AG
Thanks Qu, removing BTRFS_I from the inode fixes this issue for me.

Greets,
Stefan


On 14.03.2017 at 03:50, Qu Wenruo wrote:
> 
> 
> At 03/13/2017 09:26 PM, Stefan Priebe - Profihost AG wrote:
>>
>> On 13.03.2017 at 08:39, Qu Wenruo wrote:
>>>
>>>
>>> At 03/13/2017 03:26 PM, Stefan Priebe - Profihost AG wrote:
>>>> Hi Qu,
>>>>
>>>> On 13.03.2017 at 02:16, Qu Wenruo wrote:
>>>>
>>>> But wasn't this part of the code identical in V5? Why does it only
>>>> happen with V7?
>>>
>>> There are still difference, but just as you said, the related
>>> part(checking if inode is free space cache inode) is identical across v5
>>> and v7.
>>
>> But if I boot v7 it always happens; if I boot v5 it always works. I have
>> done 5 repeated tests.
> 
> I rechecked the code change between v7 and v5.
> 
> It turns out that, the code base may cause the problem.
> 
> In v7, the base is v4.11-rc1, which introduced quite a lot of
> btrfs_inode cleanup.
> 
> One of the difference is the parameter for btrfs_is_free_space_inode().
> 
> In v7, the parameter @inode changed from struct inode to struct
> btrfs_inode.
> 
> So in v7, we're passing BTRFS_I(inode) to btrfs_is_free_space_inode(),
> other than plain inode.
> 
> That's the most possible cause for me here.
> 
> So would you please paste the final patch applied to your tree?
> Git diff or git format-patch can both handle it.
> 
> Thanks,
> Qu
> 
>>
>>> I'm afraid that's a rare race leading to NULL btrfs_inode->root, which
>>> could happen in both v5 and v7.
>>>
>>> What's the difference between SUSE and mainline kernel?
>>
>> A lot ;-) But I don't think any of it is related.
>>
>>> Maybe some mainline kernel commits have already fixed it?
>>
>> Maybe; no idea. But I haven't found any reason why v5 works.
>>
>> Stefan
>>
>>>
>>> Thanks,
>>> Qu
>>>>
> 
> 


Re: [PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error

2017-03-13 Thread Stefan Priebe - Profihost AG

On 13.03.2017 at 08:39, Qu Wenruo wrote:
> 
> 
> At 03/13/2017 03:26 PM, Stefan Priebe - Profihost AG wrote:
>> Hi Qu,
>>
>> On 13.03.2017 at 02:16, Qu Wenruo wrote:
>>>
>>> At 03/13/2017 04:49 AM, Stefan Priebe - Profihost AG wrote:
>>>> Hi Qu,
>>>>
>>>> while V5 was running fine against the openSUSE-42.2 kernel (based on
>>>> v4.4).
>>>
>>> Thanks for the test.
>>>
>>>> V7 results in an oops for me:
>>>> BUG: unable to handle kernel NULL pointer dereference at
>>>> 01f0
>>>
>>> This 0x1f0 is the same as offsetof(struct brrfs_root, fs_info), quite
>>> nice clue.
>>>
>>>> IP: [] __endio_write_update_ordered+0x33/0x140
>>>> [btrfs]
>>>
>>> IP points to:
>>> ---
>>> static inline bool btrfs_is_free_space_inode(struct btrfs_inode *inode)
>>> {
>>> struct btrfs_root *root = inode->root; << Either here
>>>
>>> if (root == root->fs_info->tree_root && << Or here
>>> btrfs_ino(inode) != BTRFS_BTREE_INODE_OBJECTID)
>>>
>>> ---
>>>
>>> Taking the above offset into consideration, it's only possible for later
>>> case.
>>>
>>> So here, we have a btrfs_inode whose @root is NULL.
>>
>> But wasn't this part of the code identical in V5? Why does it only
>> happen with V7?
> 
> There are still difference, but just as you said, the related
> part(checking if inode is free space cache inode) is identical across v5
> and v7.

But if I boot v7 it always happens; if I boot v5 it always works. I have
done 5 repeated tests.

> I'm afraid that's a rare race leading to NULL btrfs_inode->root, which
> could happen in both v5 and v7.
> 
> What's the difference between SUSE and mainline kernel?

A lot ;-) But I don't think any of it is related.

> Maybe some mainline kernel commits have already fixed it?

Maybe; no idea. But I haven't found any reason why v5 works.

Stefan

> 
> Thanks,
> Qu
>>
>>> This can be fixed easily by checking @root inside
>>> btrfs_is_free_space_inode(), as the backtrace shows that it's only
>>> happening for DirectIO, and it won't happen for free space cache inode.
>>>
>>> But I'm more curious how this happened for a more accurate fix, or we
>>> could have other NULL pointer access.
>>>
>>> Did you have any reproducer for this?
>>
>> Sorry, no - this is a production MariaDB server running btrfs with
>> compress-force=zlib. But if I can test anything, I will.
>>
>> Greets,
>> Stefan
>>
>>>
>>> Thanks,
>>> Qu
>>>
>>>> PGD 14e18d4067 PUD 14e1868067 PMD 0
>>>> Oops:  [#1] SMP
>>>> Modules linked in: netconsole xt_multiport ipt_REJECT nf_reject_ipv4
>>>> xt_set iptable_filter ip_tables x_tables ip_set_hash_net ip_set
>>>> nfnetlink crc32_pclmul button loop btrfs xor usbhid raid6_pq
>>>> ata_generic
>>>> virtio_blk virtio_net uhci_hcd ehci_hcd i2c_piix4 usbcore virtio_pci
>>>> i2c_core usb_common ata_piix floppy
>>>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.52+112-ph #1
>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>>> 1.7.5-20140722_172050-sagunt 04/01/2014
>>>> task: b4e0f500 ti: b4e0 task.ti: b4e0
>>>> RIP: 0010:[] []
>>>> __endio_write_update_ordered+0x33/0x140 [btrfs]
>>>> RSP: 0018:8814eae03cd8 EFLAGS: 00010086
>>>> RAX:  RBX: 8814e8fd5aa8 RCX: 0001
>>>> RDX: 0010 RSI: 0010 RDI: 8814e45885c0
>>>> RBP: 8814eae03d10 R08: 8814e8334000 R09: 00018040003a
>>>> R10: ea00507d8d00 R11: 88141f634080 R12: 8814e45885c0
>>>> R13: 8814e125d700 R14: 0010 R15: 8800376c6a80
>>>> FS: () GS:8814eae0()
>>>> knlGS:
>>>> CS: 0010 DS:  ES:  CR0: 80050033
>>>> CR2: 01f0 CR3: 0014e34c9000 CR4: 001406f0
>>>> Stack:
>>>>  0010 8814e8fd5aa8 8814e953f3c0
>>>> 8814e125d700 0010 8800376c6a80 8814eae03d38
>>>> c03ddf67 8814e86b6a80 8814e8fd5aa8 0001
>>>> Call Trace:
>>>> [] btrfs_endio_direct_write+0x37/0x60 [btrfs]
>>>> [] bio_

Re: [PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error

2017-03-13 Thread Stefan Priebe - Profihost AG
Hi Qu,

On 13.03.2017 at 02:16, Qu Wenruo wrote:
> 
> At 03/13/2017 04:49 AM, Stefan Priebe - Profihost AG wrote:
>> Hi Qu,
>>
>> while V5 was running fine against the openSUSE-42.2 kernel (based on
>> v4.4).
> 
> Thanks for the test.
> 
>> V7 results in an oops for me:
>> BUG: unable to handle kernel NULL pointer dereference at 01f0
> 
> This 0x1f0 is the same as offsetof(struct brrfs_root, fs_info), quite
> nice clue.
> 
>> IP: [] __endio_write_update_ordered+0x33/0x140 [btrfs]
> 
> IP points to:
> ---
> static inline bool btrfs_is_free_space_inode(struct btrfs_inode *inode)
> {
> struct btrfs_root *root = inode->root; << Either here
> 
> if (root == root->fs_info->tree_root && << Or here
> btrfs_ino(inode) != BTRFS_BTREE_INODE_OBJECTID)
> 
> ---
> 
> Taking the above offset into consideration, it's only possible for later
> case.
> 
> So here, we have a btrfs_inode whose @root is NULL.

But wasn't this part of the code identical in V5? Why does it only
happen with V7?

> This can be fixed easily by checking @root inside
> btrfs_is_free_space_inode(), as the backtrace shows that it's only
> happening for DirectIO, and it won't happen for free space cache inode.
> 
> But I'm more curious how this happened for a more accurate fix, or we
> could have other NULL pointer access.
> 
> Did you have any reproducer for this?

Sorry, no - this is a production MariaDB server running btrfs with
compress-force=zlib. But if I can test anything, I will.

Greets,
Stefan

> 
> Thanks,
> Qu
> 
>> PGD 14e18d4067 PUD 14e1868067 PMD 0
>> Oops:  [#1] SMP
>> Modules linked in: netconsole xt_multiport ipt_REJECT nf_reject_ipv4
>> xt_set iptable_filter ip_tables x_tables ip_set_hash_net ip_set
>> nfnetlink crc32_pclmul button loop btrfs xor usbhid raid6_pq ata_generic
>> virtio_blk virtio_net uhci_hcd ehci_hcd i2c_piix4 usbcore virtio_pci
>> i2c_core usb_common ata_piix floppy
>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.52+112-ph #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> 1.7.5-20140722_172050-sagunt 04/01/2014
>> task: b4e0f500 ti: b4e0 task.ti: b4e0
>> RIP: 0010:[] []
>> __endio_write_update_ordered+0x33/0x140 [btrfs]
>> RSP: 0018:8814eae03cd8 EFLAGS: 00010086
>> RAX:  RBX: 8814e8fd5aa8 RCX: 0001
>> RDX: 0010 RSI: 0010 RDI: 8814e45885c0
>> RBP: 8814eae03d10 R08: 8814e8334000 R09: 00018040003a
>> R10: ea00507d8d00 R11: 88141f634080 R12: 8814e45885c0
>> R13: 8814e125d700 R14: 0010 R15: 8800376c6a80
>> FS: () GS:8814eae0()
>> knlGS:
>> CS: 0010 DS:  ES:  CR0: 80050033
>> CR2: 01f0 CR3: 0014e34c9000 CR4: 001406f0
>> Stack:
>>  0010 8814e8fd5aa8 8814e953f3c0
>> 8814e125d700 0010 8800376c6a80 8814eae03d38
>> c03ddf67 8814e86b6a80 8814e8fd5aa8 0001
>> Call Trace:
>> [] btrfs_endio_direct_write+0x37/0x60 [btrfs]
>> [] bio_endio+0x57/0x60
>> [] btrfs_end_bio+0xa1/0x140 [btrfs]
>> [] bio_endio+0x57/0x60
>> [] blk_update_request+0x8b/0x330
>> [] blk_mq_end_request+0x1a/0x70
>> [] virtblk_request_done+0x3f/0x70 [virtio_blk]
>> [] __blk_mq_complete_request+0x78/0xe0
>> [] blk_mq_complete_request+0x1c/0x20
>> [] virtblk_done+0x64/0xe0 [virtio_blk]
>> [] vring_interrupt+0x3a/0x90
>> [] __handle_irq_event_percpu+0x89/0x1b0
>> [] handle_irq_event_percpu+0x23/0x60
>> [] handle_irq_event+0x3b/0x60
>> [] handle_edge_irq+0x6f/0x150
>> [] handle_irq+0x1d/0x30
>> [] do_IRQ+0x4b/0xd0
>> [] common_interrupt+0x8c/0x8c
>> DWARF2 unwinder stuck at ret_from_intr+0x0/0x1b
>> Leftover inexact backtrace:
>> 2017-03-12 20:33:08 
>> 2017-03-12 20:33:08  [] ? native_safe_halt+0x6/0x10
>> [] default_idle+0x1e/0xe0
>> [] arch_cpu_idle+0xf/0x20
>> [] default_idle_call+0x3b/0x40
>> [] cpu_startup_entry+0x29a/0x370
>> [] rest_init+0x7c/0x80
>> [] start_kernel+0x490/0x49d
>> [] ? early_idt_handler_array+0x120/0x120
>> [] x86_64_start_reservations+0x2a/0x2c
>> [] x86_64_start_kernel+0x13b/0x14a
>> Code: e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 10 48 8b 87 70 fc
>> ff ff 4c 8b 87 38 fe ff ff 48 c7 45 c8 00 00 00 00 48 89 75 d0 <48> 8b
>> b8 f0 01 00 00 48 3b 47 28 49 8b 84 24 78 fc ff ff 0f 84
>> RIP [] __endio_write_upd

[PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error

2017-03-12 Thread Stefan Priebe - Profihost AG
Hi Qu,

While V5 was running fine against the openSUSE-42.2 kernel (based on v4.4),

V7 results in an oops for me:
BUG: unable to handle kernel NULL pointer dereference at 01f0
IP: [] __endio_write_update_ordered+0x33/0x140 [btrfs]
PGD 14e18d4067 PUD 14e1868067 PMD 0
Oops:  [#1] SMP
Modules linked in: netconsole xt_multiport ipt_REJECT nf_reject_ipv4
xt_set iptable_filter ip_tables x_tables ip_set_hash_net ip_set
nfnetlink crc32_pclmul button loop btrfs xor usbhid raid6_pq ata_generic
virtio_blk virtio_net uhci_hcd ehci_hcd i2c_piix4 usbcore virtio_pci
i2c_core usb_common ata_piix floppy
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.52+112-ph #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.7.5-20140722_172050-sagunt 04/01/2014
task: b4e0f500 ti: b4e0 task.ti: b4e0
RIP: 0010:[] []
__endio_write_update_ordered+0x33/0x140 [btrfs]
RSP: 0018:8814eae03cd8 EFLAGS: 00010086
RAX:  RBX: 8814e8fd5aa8 RCX: 0001
RDX: 0010 RSI: 0010 RDI: 8814e45885c0
RBP: 8814eae03d10 R08: 8814e8334000 R09: 00018040003a
R10: ea00507d8d00 R11: 88141f634080 R12: 8814e45885c0
R13: 8814e125d700 R14: 0010 R15: 8800376c6a80
FS: () GS:8814eae0() knlGS:
CS: 0010 DS:  ES:  CR0: 80050033
CR2: 01f0 CR3: 0014e34c9000 CR4: 001406f0
Stack:
 0010 8814e8fd5aa8 8814e953f3c0
8814e125d700 0010 8800376c6a80 8814eae03d38
c03ddf67 8814e86b6a80 8814e8fd5aa8 0001
Call Trace:
[] btrfs_endio_direct_write+0x37/0x60 [btrfs]
[] bio_endio+0x57/0x60
[] btrfs_end_bio+0xa1/0x140 [btrfs]
[] bio_endio+0x57/0x60
[] blk_update_request+0x8b/0x330
[] blk_mq_end_request+0x1a/0x70
[] virtblk_request_done+0x3f/0x70 [virtio_blk]
[] __blk_mq_complete_request+0x78/0xe0
[] blk_mq_complete_request+0x1c/0x20
[] virtblk_done+0x64/0xe0 [virtio_blk]
[] vring_interrupt+0x3a/0x90
[] __handle_irq_event_percpu+0x89/0x1b0
[] handle_irq_event_percpu+0x23/0x60
[] handle_irq_event+0x3b/0x60
[] handle_edge_irq+0x6f/0x150
[] handle_irq+0x1d/0x30
[] do_IRQ+0x4b/0xd0
[] common_interrupt+0x8c/0x8c
DWARF2 unwinder stuck at ret_from_intr+0x0/0x1b
Leftover inexact backtrace:
2017-03-12 20:33:08 
2017-03-12 20:33:08  [] ? native_safe_halt+0x6/0x10
[] default_idle+0x1e/0xe0
[] arch_cpu_idle+0xf/0x20
[] default_idle_call+0x3b/0x40
[] cpu_startup_entry+0x29a/0x370
[] rest_init+0x7c/0x80
[] start_kernel+0x490/0x49d
[] ? early_idt_handler_array+0x120/0x120
[] x86_64_start_reservations+0x2a/0x2c
[] x86_64_start_kernel+0x13b/0x14a
Code: e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 10 48 8b 87 70 fc
ff ff 4c 8b 87 38 fe ff ff 48 c7 45 c8 00 00 00 00 48 89 75 d0 <48> 8b
b8 f0 01 00 00 48 3b 47 28 49 8b 84 24 78 fc ff ff 0f 84
RIP [] __endio_write_update_ordered+0x33/0x140 [btrfs]
RSP 
CR2: 01f0
---[ end trace 7529a0652fd7873e ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x3300 from 0x8100 (relocation range:
0x8000-0xbfff)

Greets,
Stefan


Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues

2017-02-27 Thread Stefan Priebe - Profihost AG
Hi,

Can anybody please comment on this one? Josef? Chris? I still need these
patches to be able to let btrfs run for more than 24 hours without ENOSPC
issues.

Greets,
Stefan

On 27.02.2017 at 08:22, Qu Wenruo wrote:
> 
> 
> At 02/25/2017 04:23 PM, Stefan Priebe - Profihost AG wrote:
>> Dear Qu,
>>
>> any news on your branch? I still don't see it merged anywhere.
>>
>> Greets,
>> Stefan
> 
> I just remember that Liu Bo has commented on one of the patches; I'm afraid
> I can only push these patches once I have addressed his concern.
> 
> I'll start digging into it, as my memory of this fix is quite blurred now.
> 
> Thanks,
> Qu
>>
>> On 04.01.2017 at 17:13, Stefan Priebe - Profihost AG wrote:
>>> Hi Qu,
>>>
>>> On 01.01.2017 at 10:32, Qu Wenruo wrote:
>>>> Hi Stefan,
>>>>
>>>> I'm trying to push it to for-next (will be v4.11), but no response yet.
>>>>
>>>> It would be quite nice for you to test the following git pull and give
>>>> some feedback, so that we can merge it faster.
>>>>
>>>> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg60418.html
>>>
>>> I'm also getting a notification that Wang's email address does not exist
>>> anymore (wangxg.f...@cn.fujitsu.com).
>>>
>>> I would like to test that branch but will need some time to do so. Last
>>> time I tried 4.10-rc1 I had the same problems as this guy:
>>> https://www.marc.info/?l=linux-btrfs&m=148338312525137&w=2
>>>
>>> Stefan
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>> On 12/31/2016 03:31 PM, Stefan Priebe - Profihost AG wrote:
>>>>> Any news on this series? I can't see it in 4.9 or in 4.10-rc.
>>>>>
>>>>> Stefan
>>>>>
>>>>> On 11.11.2016 at 09:39, Wang Xiaoguang wrote:
>>>>>> When having compression enabled, Stefan Priebe often got enospc errors
>>>>>> though fs still has much free space. Qu Wenruo also has submitted a
>>>>>> fstests test case which can reproduce this bug steadily, please see
>>>>>> url: https://patchwork.kernel.org/patch/9420527
>>>>>>
>>>>>> First patch[1/3] "btrfs: improve inode's outstanding_extents
>>>>>> computation" is to
>>>>>> fix outstanding_extents and reserved_extents account issues. This
>>>>>> issue was revealed
>>>>>> by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, When modifying
>>>>>> BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often gets these
>>>>>> warnings from
>>>>>> btrfs_destroy_inode():
>>>>>> WARN_ON(BTRFS_I(inode)->outstanding_extents);
>>>>>> WARN_ON(BTRFS_I(inode)->reserved_extents);
>>>>>> Please see this patch's commit message for detailed info, and this
>>>>>> patch is
>>>>>> necessary to patch2 and patch3.
>>>>>>
>>>>>> For false enospc, the root reason is that for compression, its max
>>>>>> extent size will
>>>>>> be 128k, not 128MB. If we still use 128MB as max extent size to
>>>>>> reserve metadata for
>>>>>> compression, obviously it's not appropriate. In patch "btrfs:
>>>>>> Introduce COMPRESS
>>>>>> reserve type to fix false enospc for compression" commit message,
>>>>>> we explain why false enospc error occurs, please see it for detailed
>>>>>> info.
>>>>>>
>>>>>> To fix this issue, we introduce a new enum type:
>>>>>> enum btrfs_metadata_reserve_type {
>>>>>> BTRFS_RESERVE_NORMAL,
>>>>>> BTRFS_RESERVE_COMPRESS,
>>>>>> };
>>>>>> For btrfs_delalloc_[reserve|release]_metadata() and
>>>>>> btrfs_delalloc_[reserve|release]_space(), we introduce a new
>>>>>> btrfs_metadata_reserve_type
>>>>>> argument, then if a path needs to go compression, we pass
>>>>>> BTRFS_RESERVE_COMPRESS,
>>>>>> otherwise pass BTRFS_RESERVE_NORMAL.
>>>>>>
>>>>>> With these patches, Stefan no longer saw such false enospc errors, and
>>>>>> Qu Wenruo's
>>>>>> fstests test case will also pass. I have also run whole fstests
>>>>>> multiple times,
>>>>>>

Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues

2017-02-25 Thread Stefan Priebe - Profihost AG
Dear Qu,

Any news on your branch? I still don't see it merged anywhere.

Greets,
Stefan

On 04.01.2017 at 17:13, Stefan Priebe - Profihost AG wrote:
> Hi Qu,
> 
> On 01.01.2017 at 10:32, Qu Wenruo wrote:
>> Hi Stefan,
>>
>> I'm trying to push it to for-next (will be v4.11), but no response yet.
>>
>> It would be quite nice for you to test the following git pull and give
>> some feedback, so that we can merge it faster.
>>
>> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg60418.html
> 
> I'm also getting a notification that Wang's email address does not exist
> anymore (wangxg.f...@cn.fujitsu.com).
> 
> I would like to test that branch but will need some time to do so. Last
> time I tried 4.10-rc1 I had the same problems as this guy:
> https://www.marc.info/?l=linux-btrfs&m=148338312525137&w=2
> 
> Stefan
> 
>> Thanks,
>> Qu
>>
>> On 12/31/2016 03:31 PM, Stefan Priebe - Profihost AG wrote:
>>> Any news on this series? I can't see it in 4.9 or in 4.10-rc.
>>>
>>> Stefan
>>>
>>>> On 11.11.2016 at 09:39, Wang Xiaoguang wrote:
>>>> When having compression enabled, Stefan Priebe often got enospc errors
>>>> though fs still has much free space. Qu Wenruo also has submitted a
>>>> fstests test case which can reproduce this bug steadily, please see
>>>> url: https://patchwork.kernel.org/patch/9420527
>>>>
>>>> First patch[1/3] "btrfs: improve inode's outstanding_extents
>>>> computation" is to
>>>> fix outstanding_extents and reserved_extents account issues. This
>>>> issue was revealed
>>>> by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, When modifying
>>>> BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often gets these
>>>> warnings from
>>>> btrfs_destroy_inode():
>>>> WARN_ON(BTRFS_I(inode)->outstanding_extents);
>>>> WARN_ON(BTRFS_I(inode)->reserved_extents);
>>>> Please see this patch's commit message for detailed info, and this
>>>> patch is
>>>> necessary to patch2 and patch3.
>>>>
>>>> For false enospc, the root reason is that for compression, its max
>>>> extent size will
>>>> be 128k, not 128MB. If we still use 128MB as max extent size to
>>>> reserve metadata for
>>>> compression, obviously it's not appropriate. In patch "btrfs:
>>>> Introduce COMPRESS
>>>> reserve type to fix false enospc for compression" commit message,
>>>> we explain why false enospc error occurs, please see it for detailed
>>>> info.
>>>>
>>>> To fix this issue, we introduce a new enum type:
>>>> enum btrfs_metadata_reserve_type {
>>>> BTRFS_RESERVE_NORMAL,
>>>> BTRFS_RESERVE_COMPRESS,
>>>> };
>>>> For btrfs_delalloc_[reserve|release]_metadata() and
>>>> btrfs_delalloc_[reserve|release]_space(), we introduce a new
>>>> btrfs_metadata_reserve_type
>>>> argument, then if a path needs to go compression, we pass
>>>> BTRFS_RESERVE_COMPRESS,
>>>> otherwise pass BTRFS_RESERVE_NORMAL.
>>>>
>>>> With these patches, Stefan no longer saw such false enospc errors, and
>>>> Qu Wenruo's
>>>> fstests test case will also pass. I have also run whole fstests
>>>> multiple times,
>>>> no regression occurs, thanks.
>>>>
>>>> Wang Xiaoguang (3):
>>>>   btrfs: improve inode's outstanding_extents computation
>>>>   btrfs: introduce type based delalloc metadata reserve
>>>>   btrfs: Introduce COMPRESS reserve type to fix false enospc for
>>>> compression
>>>>
>>>>  fs/btrfs/ctree.h |  36 +--
>>>>  fs/btrfs/extent-tree.c   |  52 ++---
>>>>  fs/btrfs/extent_io.c |  61 ++-
>>>>  fs/btrfs/extent_io.h |   5 +
>>>>  fs/btrfs/file.c  |  25 +++--
>>>>  fs/btrfs/free-space-cache.c  |   6 +-
>>>>  fs/btrfs/inode-map.c |   6 +-
>>>>  fs/btrfs/inode.c | 246
>>>> ++-
>>>>  fs/btrfs/ioctl.c |  16 +--
>>>>  fs/btrfs/relocation.c|  14 ++-
>>>>  fs/btrfs/tests/inode-tests.c |  15 +--
>>>>  11 files changed, 381 insertions(+), 101 deletions(-)
>>>>


high cpu usage due to btrfs_find_space_for_alloc and rb_next

2017-02-17 Thread Stefan Priebe - Profihost AG
Hi,

Is there any chance to optimize btrfs_find_space_for_alloc / rb_next on
big devices?

I have plenty of free space, but most of the time there is only low I/O yet
high CPU usage. perf top shows:

  60,41%  [kernel]   [k] rb_next
   9,74%  [kernel]   [k] btrfs_find_space_for_alloc
   5,55%  [kernel]   [k] tree_search_offset.isra.25

# btrfs filesystem df /backup/
Data, single: total=14.85TiB, used=14.37TiB
System, single: total=32.00MiB, used=2.27MiB
Metadata, single: total=63.00GiB, used=54.87GiB
GlobalReserve, single: total=512.00MiB, used=80.17MiB

--
Stefan


Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues

2017-01-04 Thread Stefan Priebe - Profihost AG
Hi Qu,

On 01.01.2017 at 10:32, Qu Wenruo wrote:
> Hi Stefan,
> 
> I'm trying to push it to for-next (will be v4.11), but no response yet.
> 
> It would be quite nice for you to test the following git pull and give
> some feedback, so that we can merge it faster.
> 
> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg60418.html

I'm also getting a notification that Wang's email address does not exist
anymore (wangxg.f...@cn.fujitsu.com).

I would like to test that branch but will need some time to do so. Last
time I tried 4.10-rc1 I had the same problems as this guy:
https://www.marc.info/?l=linux-btrfs&m=148338312525137&w=2

Stefan

> Thanks,
> Qu
> 
> On 12/31/2016 03:31 PM, Stefan Priebe - Profihost AG wrote:
>> Any news on this series? I can't see it in 4.9 or in 4.10-rc.
>>
>> Stefan
>>
>>> On 11.11.2016 at 09:39, Wang Xiaoguang wrote:
>>> When having compression enabled, Stefan Priebe often got enospc errors
>>> though fs still has much free space. Qu Wenruo also has submitted a
>>> fstests test case which can reproduce this bug steadily, please see
>>> url: https://patchwork.kernel.org/patch/9420527
>>>
>>> First patch[1/3] "btrfs: improve inode's outstanding_extents
>>> computation" is to
>>> fix outstanding_extents and reserved_extents account issues. This
>>> issue was revealed
>>> by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, When modifying
>>> BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often gets these
>>> warnings from
>>> btrfs_destroy_inode():
>>> WARN_ON(BTRFS_I(inode)->outstanding_extents);
>>> WARN_ON(BTRFS_I(inode)->reserved_extents);
>>> Please see this patch's commit message for detailed info, and this
>>> patch is
>>> necessary to patch2 and patch3.
>>>
>>> For false enospc, the root reason is that for compression, its max
>>> extent size will
>>> be 128k, not 128MB. If we still use 128MB as max extent size to
>>> reserve metadata for
>>> compression, obviously it's not appropriate. In patch "btrfs:
>>> Introduce COMPRESS
>>> reserve type to fix false enospc for compression" commit message,
>>> we explain why false enospc error occurs, please see it for detailed
>>> info.
>>>
>>> To fix this issue, we introduce a new enum type:
>>> enum btrfs_metadata_reserve_type {
>>> BTRFS_RESERVE_NORMAL,
>>> BTRFS_RESERVE_COMPRESS,
>>> };
>>> For btrfs_delalloc_[reserve|release]_metadata() and
>>> btrfs_delalloc_[reserve|release]_space(), we introduce a new
>>> btrfs_metadata_reserve_type
>>> argument, then if a path needs to go compression, we pass
>>> BTRFS_RESERVE_COMPRESS,
>>> otherwise pass BTRFS_RESERVE_NORMAL.
>>>
>>> With these patches, Stefan no longer saw such false enospc errors, and
>>> Qu Wenruo's
>>> fstests test case will also pass. I have also run whole fstests
>>> multiple times,
>>> no regression occurs, thanks.
>>>
>>> Wang Xiaoguang (3):
>>>   btrfs: improve inode's outstanding_extents computation
>>>   btrfs: introduce type based delalloc metadata reserve
>>>   btrfs: Introduce COMPRESS reserve type to fix false enospc for
>>> compression
>>>
>>>  fs/btrfs/ctree.h |  36 +--
>>>  fs/btrfs/extent-tree.c   |  52 ++---
>>>  fs/btrfs/extent_io.c |  61 ++-
>>>  fs/btrfs/extent_io.h |   5 +
>>>  fs/btrfs/file.c  |  25 +++--
>>>  fs/btrfs/free-space-cache.c  |   6 +-
>>>  fs/btrfs/inode-map.c |   6 +-
>>>  fs/btrfs/inode.c | 246
>>> ++-
>>>  fs/btrfs/ioctl.c |  16 +--
>>>  fs/btrfs/relocation.c|  14 ++-
>>>  fs/btrfs/tests/inode-tests.c |  15 +--
>>>  11 files changed, 381 insertions(+), 101 deletions(-)
>>>


Re: [PATCH 0/3] introduce type based delalloc metadata reserve to fix some false enospc issues

2016-12-30 Thread Stefan Priebe - Profihost AG
Any news on this series? I can't see it in 4.9 or in 4.10-rc.

Stefan

On 11.11.2016 at 09:39, Wang Xiaoguang wrote:
> When having compression enabled, Stefan Priebe often got enospc errors
> though fs still has much free space. Qu Wenruo also has submitted a
> fstests test case which can reproduce this bug steadily, please see
> url: https://patchwork.kernel.org/patch/9420527
> 
> The first patch [1/3], "btrfs: improve inode's outstanding_extents
> computation", fixes outstanding_extents and reserved_extents accounting
> issues. The issue was revealed by modifying BTRFS_MAX_EXTENT_SIZE (128MB)
> to 64KB; with that change, the fsstress test often gets these warnings
> from btrfs_destroy_inode():
> WARN_ON(BTRFS_I(inode)->outstanding_extents);
> WARN_ON(BTRFS_I(inode)->reserved_extents);
> Please see this patch's commit message for detailed info, and this patch is
> necessary to patch2 and patch3.
> 
> For false enospc, the root reasson is that for compression, its max extent 
> size will
> be 128k, not 128MB. If we still use 128MB as max extent size to reserve 
> metadata for
> compression, obviously it's not appropriate. In patch "btrfs: Introduce 
> COMPRESS
> reserve type to fix false enospc for compression" commit message,
> we explain why false enospc error occurs, please see it for detailed info.
> 
> To fix this issue, we introduce a new enum type:
>   enum btrfs_metadata_reserve_type {
>   BTRFS_RESERVE_NORMAL,
>   BTRFS_RESERVE_COMPRESS,
>   };
> For btrfs_delalloc_[reserve|release]_metadata() and
> btrfs_delalloc_[reserve|release]_space(), we introce a new 
> btrfs_metadata_reserve_type
> argument, then if a path needs to go compression, we pass 
> BTRFS_RESERVE_COMPRESS,
> otherwise pass BTRFS_RESERVE_NORMAL.
> 
> With these patchs, Stefan no longer saw such false enospc errors, and Qu 
> Wenruo's
> fstests test case will also pass. I have also run whole fstests multiple 
> times,
> no regression occurs, thanks.
> 
> Wang Xiaoguang (3):
>   btrfs: improve inode's outstanding_extents computation
>   btrfs: introduce type based delalloc metadata reserve
>   btrfs: Introduce COMPRESS reserve type to fix false enospc for
> compression
> 
>  fs/btrfs/ctree.h |  36 +--
>  fs/btrfs/extent-tree.c   |  52 ++---
>  fs/btrfs/extent_io.c |  61 ++-
>  fs/btrfs/extent_io.h |   5 +
>  fs/btrfs/file.c  |  25 +++--
>  fs/btrfs/free-space-cache.c  |   6 +-
>  fs/btrfs/inode-map.c |   6 +-
>  fs/btrfs/inode.c | 246 
> ++-
>  fs/btrfs/ioctl.c |  16 +--
>  fs/btrfs/relocation.c|  14 ++-
>  fs/btrfs/tests/inode-tests.c |  15 +--
>  11 files changed, 381 insertions(+), 101 deletions(-)
> 
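To make the 128K-vs-128MB point from the cover letter concrete, the reservation arithmetic can be sketched in a few lines of shell. This is a hedged illustration, not code from the patches: the 1 GiB range is an arbitrary example, and ceil division over the assumed max extent size is my reading of how the extent count is derived.

```shell
# How many extents btrfs has to reserve metadata for depends on the
# assumed maximum extent size: ceil(len / max_extent_size).
extents_for() {   # $1 = delalloc length in bytes, $2 = max extent size
    echo $(( ($1 + $2 - 1) / $2 ))
}

LEN=$(( 1024 * 1024 * 1024 ))                    # a 1 GiB dirty range
echo "uncompressed (128MiB max): $(extents_for "$LEN" $((128 * 1024 * 1024)))"
echo "compressed   (128KiB max): $(extents_for "$LEN" $((128 * 1024)))"
```

For the same range the two assumptions differ by a factor of 1024 in extent count, which, as I read the cover letter, is the gap between what the old code reserved and what compression actually creates.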


Re: Metadata balance fails ENOSPC

2016-12-05 Thread Stefan Priebe - Profihost AG
Isn't there a way to move free space to unallocated space again?


Am 03.12.2016 um 05:43 schrieb Andrei Borzenkov:
> 01.12.2016 18:48, Chris Murphy пишет:
>> On Thu, Dec 1, 2016 at 7:10 AM, Stefan Priebe - Profihost AG
>> <s.pri...@profihost.ag> wrote:
>>>
>>> Am 01.12.2016 um 14:51 schrieb Hans van Kranenburg:
>>>> On 12/01/2016 09:12 AM, Andrei Borzenkov wrote:
>>>>> On Thu, Dec 1, 2016 at 10:49 AM, Stefan Priebe - Profihost AG
>>>>> <s.pri...@profihost.ag> wrote:
>>>>> ...
>>>>>>
>>>>>> Custom 4.4 kernel with patches up to 4.10. But i already tried 4.9-rc7
>>>>>> which does the same.
>>>>>>
>>>>>>
>>>>>>>> # btrfs filesystem show /ssddisk/
>>>>>>>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>>>>>>> Total devices 1 FS bytes used 305.67GiB
>>>>>>>> devid1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>>>>>>>
>>>>>>>> # btrfs filesystem usage /ssddisk/
>>>>>>>> Overall:
>>>>>>>> Device size: 500.00GiB
>>>>>>>> Device allocated:500.00GiB
>>>>>>>> Device unallocated:1.05MiB
>>>>>>>
>>>>>>> Drive is actually fully allocated so if Btrfs needs to create a new
>>>>>>> chunk right now, it can't. However,
>>>>>>
>>>>>> Yes but there's lot of free space:
>>>>>> Free (estimated):193.46GiB  (min: 193.46GiB)
>>>>>>
>>>>>> How does this match?
>>>>>>
>>>>>>
>>>>>>> All three chunk types have quite a bit of unused space in them, so
>>>>>>> it's unclear why there's a no space left error.
>>>>>>>
>>>>>
>>>>> I remember discussion that balance always tries to pre-allocate one
>>>>> chunk in advance, and I believe there was patch to correct it but I am
>>>>> not sure whether it was merged.
>>>>
>>>> http://www.spinics.net/lists/linux-btrfs/msg56772.html
>>>
>>> Thanks - still don't understand why that one is not upstream or why it
>>> was reverted. Looks absolutely reasonable to me.
>>
>> It is upstream and hasn't been reverted.
>>
>> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/fs/btrfs/volumes.c?id=refs/tags/v4.8.11
>> line 3650
>>
>> I would try Duncan's idea of using just one filter and seeing what happens:
>>
>> 'btrfs balance start -dusage=1 '
>>
> 
> Actually I just hit exactly the same symptoms on my VM where device was
> fully allocated and metadata balance failed, but data balance succeeded
> to free up space which allowed metadata balance to run too. This is
> under 4.8.10.
> 
> So it appears that balance logic between data and metadata is somehow
> different.
> 
> As this VM gets in 100% allocated condition fairly often I'd try to get
> better understanding next time.
> 
> 
>>
>>>>>> With enospc debug it says:
>>>>>> [39193.425682] BTRFS warning (device vdb1): no space to allocate a new
>>>>>> chunk for block group 839941881856
>>>>>> [39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance
>>
>> It might be nice if this stated what kind of chunk it's trying to allocate.
>>
>>
>>
> 


Re: Metadata balance fails ENOSPC

2016-12-01 Thread Stefan Priebe - Profihost AG

Am 01.12.2016 um 16:48 schrieb Chris Murphy:
> On Thu, Dec 1, 2016 at 7:10 AM, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
>>
>> Am 01.12.2016 um 14:51 schrieb Hans van Kranenburg:
>>> On 12/01/2016 09:12 AM, Andrei Borzenkov wrote:
>>>> On Thu, Dec 1, 2016 at 10:49 AM, Stefan Priebe - Profihost AG
>>>> <s.pri...@profihost.ag> wrote:
>>>> ...
>>>>>
>>>>> Custom 4.4 kernel with patches up to 4.10. But i already tried 4.9-rc7
>>>>> which does the same.
>>>>>
>>>>>
>>>>>>> # btrfs filesystem show /ssddisk/
>>>>>>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>>>>>> Total devices 1 FS bytes used 305.67GiB
>>>>>>> devid1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>>>>>>
>>>>>>> # btrfs filesystem usage /ssddisk/
>>>>>>> Overall:
>>>>>>> Device size: 500.00GiB
>>>>>>> Device allocated:500.00GiB
>>>>>>> Device unallocated:1.05MiB
>>>>>>
>>>>>> Drive is actually fully allocated so if Btrfs needs to create a new
>>>>>> chunk right now, it can't. However,
>>>>>
>>>>> Yes but there's lot of free space:
>>>>> Free (estimated):193.46GiB  (min: 193.46GiB)
>>>>>
>>>>> How does this match?
>>>>>
>>>>>
>>>>>> All three chunk types have quite a bit of unused space in them, so
>>>>>> it's unclear why there's a no space left error.
>>>>>>
>>>>
>>>> I remember discussion that balance always tries to pre-allocate one
>>>> chunk in advance, and I believe there was patch to correct it but I am
>>>> not sure whether it was merged.
>>>
>>> http://www.spinics.net/lists/linux-btrfs/msg56772.html
>>
>> Thanks - still don't understand why that one is not upstream or why it
>> was reverted. Looks absolutely reasonable to me.
> 
> It is upstream and hasn't been reverted.
> 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/fs/btrfs/volumes.c?id=refs/tags/v4.8.11
> line 3650
> 
> I would try Duncan's idea of using just one filter and seeing what happens:
> 
> 'btrfs balance start -dusage=1 '

see below:

[zabbix-db ~]# btrfs balance start -dusage=1 /ssddisk/
Done, had to relocate 0 out of 505 chunks
[zabbix-db ~]# btrfs balance start -dusage=10 /ssddisk/
Done, had to relocate 0 out of 505 chunks
[zabbix-db ~]# btrfs balance start -musage=1 /ssddisk/
ERROR: error during balancing '/ssddisk/': No space left on device
There may be more info in syslog - try dmesg | tail
[zabbix-db ~]# dmesg
[78306.288834] BTRFS warning (device vdb1): no space to allocate a new
chunk for block group 839941881856
[78306.289197] BTRFS info (device vdb1): 1 enospc errors during balance

> 
> 
>>>>> With enospc debug it says:
>>>>> [39193.425682] BTRFS warning (device vdb1): no space to allocate a new
>>>>> chunk for block group 839941881856
>>>>> [39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance
> 
> It might be nice if this stated what kind of chunk it's trying to allocate.
> 
> 
> 


Re: Metadata balance fails ENOSPC

2016-12-01 Thread Stefan Priebe - Profihost AG

Am 01.12.2016 um 14:51 schrieb Hans van Kranenburg:
> On 12/01/2016 09:12 AM, Andrei Borzenkov wrote:
>> On Thu, Dec 1, 2016 at 10:49 AM, Stefan Priebe - Profihost AG
>> <s.pri...@profihost.ag> wrote:
>> ...
>>>
>>> Custom 4.4 kernel with patches up to 4.10. But i already tried 4.9-rc7
>>> which does the same.
>>>
>>>
>>>>> # btrfs filesystem show /ssddisk/
>>>>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>>>> Total devices 1 FS bytes used 305.67GiB
>>>>> devid1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>>>>
>>>>> # btrfs filesystem usage /ssddisk/
>>>>> Overall:
>>>>> Device size: 500.00GiB
>>>>> Device allocated:500.00GiB
>>>>> Device unallocated:1.05MiB
>>>>
>>>> Drive is actually fully allocated so if Btrfs needs to create a new
>>>> chunk right now, it can't. However,
>>>
>>> Yes but there's lot of free space:
>>> Free (estimated):193.46GiB  (min: 193.46GiB)
>>>
>>> How does this match?
>>>
>>>
>>>> All three chunk types have quite a bit of unused space in them, so
>>>> it's unclear why there's a no space left error.
>>>>
>>
>> I remember discussion that balance always tries to pre-allocate one
>> chunk in advance, and I believe there was patch to correct it but I am
>> not sure whether it was merged.
> 
> http://www.spinics.net/lists/linux-btrfs/msg56772.html

Thanks - still don't understand why that one is not upstream or why it
was reverted. Looks absolutely reasonable to me. Another option would be
to make it possible to turn allocated but unused space back into
unallocated space - no idea how to do that.

> 
>>>> Try remounting with enospc_debug, and then trigger the problem again,
>>>> and post the resulting kernel messages.
>>>
>>> With enospc debug it says:
>>> [39193.425682] BTRFS warning (device vdb1): no space to allocate a new
>>> chunk for block group 839941881856
>>> [39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance
> 


Re: Metadata balance fails ENOSPC

2016-12-01 Thread Stefan Priebe - Profihost AG

Am 01.12.2016 um 09:12 schrieb Andrei Borzenkov:
> On Thu, Dec 1, 2016 at 10:49 AM, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
> ...
>>
>> Custom 4.4 kernel with patches up to 4.10. But i already tried 4.9-rc7
>> which does the same.
>>
>>
>>>> # btrfs filesystem show /ssddisk/
>>>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>>>> Total devices 1 FS bytes used 305.67GiB
>>>> devid1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>>>
>>>> # btrfs filesystem usage /ssddisk/
>>>> Overall:
>>>> Device size: 500.00GiB
>>>> Device allocated:500.00GiB
>>>> Device unallocated:1.05MiB
>>>
>>> Drive is actually fully allocated so if Btrfs needs to create a new
>>> chunk right now, it can't. However,
>>
>> Yes but there's lot of free space:
>> Free (estimated):193.46GiB  (min: 193.46GiB)
>>
>> How does this match?
>>
>>
>>> All three chunk types have quite a bit of unused space in them, so
>>> it's unclear why there's a no space left error.
>>>
> 
> I remember discussion that balance always tries to pre-allocate one
> chunk in advance, and I believe there was patch to correct it but I am
> not sure whether it was merged.

Is there otherwise a possibility to make the free space unallocated again?

Stefan

> 
>>> Try remounting with enospc_debug, and then trigger the problem again,
>>> and post the resulting kernel messages.
>>
>> With enospc debug it says:
>> [39193.425682] BTRFS warning (device vdb1): no space to allocate a new
>> chunk for block group 839941881856
>> [39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance
>>


Re: Metadata balance fails ENOSPC

2016-11-30 Thread Stefan Priebe - Profihost AG
Am 01.12.2016 um 00:02 schrieb Chris Murphy:
> On Wed, Nov 30, 2016 at 2:03 PM, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
>> Hello,
>>
>> # btrfs balance start -v -dusage=0 -musage=1 /ssddisk/
>> Dumping filters: flags 0x7, state 0x0, force is off
>>   DATA (flags 0x2): balancing, usage=0
>>   METADATA (flags 0x2): balancing, usage=1
>>   SYSTEM (flags 0x2): balancing, usage=1
>> ERROR: error during balancing '/ssddisk/': No space left on device
>> There may be more info in syslog - try dmesg | tail
> 
> You haven't provided kernel messages at the time of the error.

Kernel Message:
[  429.107723] BTRFS info (device vdb1): 1 enospc errors during balance

> Also useful is the kernel version.

Custom 4.4 kernel with patches up to 4.10. But I already tried 4.9-rc7,
which does the same.


>> # btrfs filesystem show /ssddisk/
>> Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
>> Total devices 1 FS bytes used 305.67GiB
>> devid1 size 500.00GiB used 500.00GiB path /dev/vdb1
>>
>> # btrfs filesystem usage /ssddisk/
>> Overall:
>> Device size: 500.00GiB
>> Device allocated:500.00GiB
>> Device unallocated:1.05MiB
> 
> Drive is actually fully allocated so if Btrfs needs to create a new
> chunk right now, it can't. However,

Yes, but there's a lot of free space:
Free (estimated):193.46GiB  (min: 193.46GiB)

How does this match?


> All three chunk types have quite a bit of unused space in them, so
> it's unclear why there's a no space left error.
> 
> Try remounting with enospc_debug, and then trigger the problem again,
> and post the resulting kernel messages.

With enospc debug it says:
[39193.425682] BTRFS warning (device vdb1): no space to allocate a new
chunk for block group 839941881856
[39193.426033] BTRFS info (device vdb1): 1 enospc errors during balance

Greets,
Stefan


Metadata balance fails ENOSPC

2016-11-30 Thread Stefan Priebe - Profihost AG
Hello,

# btrfs balance start -v -dusage=0 -musage=1 /ssddisk/
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=0
  METADATA (flags 0x2): balancing, usage=1
  SYSTEM (flags 0x2): balancing, usage=1
ERROR: error during balancing '/ssddisk/': No space left on device
There may be more info in syslog - try dmesg | tail

# btrfs filesystem show /ssddisk/
Label: none  uuid: a69d2e90-c2ca-4589-9876-234446868adc
Total devices 1 FS bytes used 305.67GiB
devid1 size 500.00GiB used 500.00GiB path /dev/vdb1

# btrfs filesystem usage /ssddisk/
Overall:
Device size: 500.00GiB
Device allocated:500.00GiB
Device unallocated:1.05MiB
Device missing:  0.00B
Used:305.69GiB
Free (estimated):185.78GiB  (min: 185.78GiB)
Data ratio:   1.00
Metadata ratio:   1.00
Global reserve:  512.00MiB  (used: 608.00KiB)

Data,single: Size:483.97GiB, Used:298.18GiB
   /dev/vdb1 483.97GiB

Metadata,single: Size:16.00GiB, Used:7.51GiB
   /dev/vdb1  16.00GiB

System,single: Size:32.00MiB, Used:144.00KiB
   /dev/vdb1  32.00MiB

Unallocated:
   /dev/vdb1   1.05MiB

How can I make it balance again?

Greets,
Stefan
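On a fully allocated device, the usual approach is to compact data chunks first: a data balance can often proceed even when a metadata balance cannot, and each relocated chunk hands space back as unallocated. A hedged sketch, printed as a plan rather than executed; the mount point comes from the report above, but the filter escalation steps are my assumption, not from this thread:

```shell
# Hedged sketch: escalate the data balance usage filter step by step.
# Stop as soon as "btrfs filesystem usage" shows unallocated space
# again, then retry the metadata balance. Run the printed commands by
# hand as root.
MNT=/ssddisk

balance_plan() {
    local pct
    for pct in 1 5 10 25 50; do
        echo "btrfs balance start -dusage=$pct $MNT"
    done
    echo "btrfs balance start -musage=1 $MNT"
}

balance_plan
```

Low usage filters relocate nearly-empty chunks cheaply; raising the percentage only when a step frees nothing keeps the amount of data rewritten small.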


Re: resend: Re: Btrfs: adjust len of writes if following a preallocated extent

2016-11-23 Thread Stefan Priebe - Profihost AG

Am 23.11.2016 um 19:23 schrieb Holger Hoffstätte:
> On 11/23/16 18:21, Stefan Priebe - Profihost AG wrote:
>> Am 04.11.2016 um 20:20 schrieb Liu Bo:
>>> If we have
>>>
>>> |0--hole--4095||4096--preallocate--12287|
>>>
>>> instead of using preallocated space, a 8K direct write will just
>>> create a new 8K extent and it'll end up with
>>>
>>> |0--new extent--8191||8192--preallocate--12287|
>>>
>>> It's because we find a hole em and then go to create a new 8K
>>> extent directly without adjusting @len.
>>
>> After applying that one on top of my 4.4 btrfs branch (which includes
>> patches up to 4.10 / next), I'm getting deadlocks in btrfs.
> 
> *ctrl+f sectorsize* .. 
> 
> That's not surprising if you did what I suspect. If your tree is based
> on my - now really very retired - 4.4.x queue, then you are likely missing
> _all the other blocksize/sectorsize patches_ that came in from Chandra
> Seetharaman et al., which I _really_ carefully patched around, for many
> good reasons.

*arg* that makes sense. Still not easy to find out which ones to skip.
Yes that one is based on yours.

thanks,
Stefan

> 
> -h
> 


resend: Re: Btrfs: adjust len of writes if following a preallocated extent

2016-11-23 Thread Stefan Priebe - Profihost AG
Hi,

Sorry, the last mail was from the wrong box.

Am 04.11.2016 um 20:20 schrieb Liu Bo:
> If we have
> 
> |0--hole--4095||4096--preallocate--12287|
> 
> instead of using preallocated space, a 8K direct write will just
> create a new 8K extent and it'll end up with
> 
> |0--new extent--8191||8192--preallocate--12287|
> 
> It's because we find a hole em and then go to create a new 8K
> extent directly without adjusting @len.

After applying that one on top of my 4.4 btrfs branch (which includes
patches up to 4.10 / next), I'm getting deadlocks in btrfs.

Traces here:
INFO: task btrfs-transacti:604 blocked for more than 120 seconds.
  Not tainted 4.4.34 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-transacti D 8814e78cbe00 0   604  2 0x0008
 8814e78cbe00 88017367a540 8814e2f88000 8814e78cc000
 8814e78cbe38 88123616c510 8814e24c81f0 88153fb0a000
 8814e78cbe18 816a8425 8814e63165a0 8814e78cbe88
Call Trace:
 [] schedule+0x35/0x80
 [] btrfs_commit_transaction+0x275/0xa50 [btrfs]
 [] transaction_kthread+0x1d6/0x200 [btrfs]
 [] kthread+0xdb/0x100
 [] ret_from_fork+0x3f/0x70
DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70

Leftover inexact backtrace:

 [] ? kthread_park+0x60/0x60
INFO: task mysqld:1977 blocked for more than 120 seconds.
  Not tainted 4.4.34 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld  D 88142ef1bcf8 0  1977  1 0x0008
 88142ef1bcf8 81e0f500 8814dc2c4a80 88142ef1c000
 8814e32ed298 8814e32ed2c0 88110aa9a000 8814e32ed000
 88142ef1bd10 816a8425 8814e32ed000 88142ef1bd60
Call Trace:
 [] schedule+0x35/0x80
 [] wait_for_writer+0xa2/0xb0 [btrfs]
 [] btrfs_sync_log+0xe9/0xa00 [btrfs]
 [] btrfs_sync_file+0x35f/0x3d0 [btrfs]
 [] vfs_fsync_range+0x3d/0xb0
 [] do_fsync+0x3d/0x70
 [] SyS_fsync+0x10/0x20
 [] entry_SYSCALL_64_fastpath+0x12/0x71
DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x71

Leftover inexact backtrace:

INFO: task mysqld:3249 blocked for more than 120 seconds.
  Not tainted 4.4.34 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld  D 881475fdfa40 0  3249  1 0x0008
 881475fdfa40 88017367ca80 8814433d2540 881475fe
 88040da39ba0 0023 88040da39c20 00238000
 881475fdfa58 816a8425 8000 881475fdfb18
Call Trace:
 [] schedule+0x35/0x80
 []
wait_ordered_extents.isra.18.constprop.23+0x147/0x3d0 [btrfs]
 [] btrfs_log_changed_extents+0x242/0x610 [btrfs]
 [] btrfs_log_inode+0x874/0xb80 [btrfs]
 [] btrfs_log_inode_parent+0x22c/0x910 [btrfs]
 [] btrfs_log_dentry_safe+0x62/0x80 [btrfs]
 [] btrfs_sync_file+0x28c/0x3d0 [btrfs]
 [] vfs_fsync_range+0x3d/0xb0
 [] do_fsync+0x3d/0x70
 [] SyS_fsync+0x10/0x20
 [] entry_SYSCALL_64_fastpath+0x12/0x71
DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x71

Leftover inexact backtrace:

INFO: task mysqld:3250 blocked for more than 120 seconds.
  Not tainted 4.4.34 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld  D 881374edb868 0  3250  1 0x0008
 881374edb868 8801736b2540 8814433d4a80 881374edc000
 8814e26f81c8 8814e26f81e0 00238000 000a8000
 881374edb880 816a8425 8814433d4a80 881374edb8d8
Call Trace:
 [] schedule+0x35/0x80
 [] rwsem_down_read_failed+0xed/0x130
 [] call_rwsem_down_read_failed+0x14/0x30
DWARF2 unwinder stuck at call_rwsem_down_read_failed+0x14/0x30

Leftover inexact backtrace:

 [] ? down_read+0x17/0x20
 [] btrfs_create_dio_extent+0x46/0x1e0 [btrfs]
 [] btrfs_get_blocks_direct+0x3d8/0x730 [btrfs]
 [] ? btrfs_submit_direct+0x1ce/0x740 [btrfs]
 [] do_blockdev_direct_IO+0x11f7/0x2bc0
 [] ? btrfs_page_exists_in_range+0xe0/0xe0 [btrfs]
 [] ? btrfs_getattr+0xa0/0xa0 [btrfs]
 [] __blockdev_direct_IO+0x43/0x50
 [] ? btrfs_getattr+0xa0/0xa0 [btrfs]
 [] btrfs_direct_IO+0x1d1/0x380 [btrfs]
 [] ? btrfs_getattr+0xa0/0xa0 [btrfs]
 [] generic_file_direct_write+0xaa/0x170
 [] btrfs_file_write_iter+0x2ae/0x560 [btrfs]
 [] ? futex_wake+0x81/0x150
 [] new_sync_write+0x84/0xb0
 [] __vfs_write+0x26/0x40
 [] vfs_write+0xa9/0x190
 [] ? enter_from_user_mode+0x1f/0x50
 [] SyS_pwrite64+0x6b/0xa0
 [] ? syscall_return_slowpath+0xb0/0x130
 [] entry_SYSCALL_64_fastpath+0x12/0x71
INFO: task btrfs-transacti:604 blocked for more than 120 seconds.
  Not tainted 4.4.34 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-transacti D 8814e78cbe00 0   604  2 0x0008
 8814e78cbe00 88017367a540 8814e2f88000 8814e78cc000
 8814e78cbe38 88123616c510 8814e24c81f0 88153fb0a000
 8814e78cbe18 816a8425 8814e63165a0 8814e78cbe88
Call Trace:
 [] schedule+0x35/0x80
 [] btrfs_commit_transaction+0x275/0xa50 [btrfs]
 [] 

Re: spinning kworker with space_cache=v2 searching for free space

2016-11-11 Thread Stefan Priebe - Profihost AG

Am 12.11.2016 um 03:18 schrieb Liu Bo:
> On Wed, Nov 09, 2016 at 09:19:21PM +0100, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> found this one from 2014:
>> https://patchwork.kernel.org/patch/5551651/
>>
>> Is this still valid?
> 
> The space cache code doesn't change a lot, so I think the patch is still
> valid to apply(there might be some conflicts though), but I'm not sure
> if it could help the spinning case.

Thanks, got it applied and will try it. Any other ideas why it's spinning
there? Free space fragmentation?

But at least on one machine there are 26TB free and it's spinning...
slowing down the performance.

Greets,
Stefan

> 
> Thanks,
> 
> -liubo
>>
>> Am 09.11.2016 um 09:09 schrieb Stefan Priebe - Profihost AG:
>>> Dear list,
>>>
>>> even though there's a lot of free space on my disk:
>>>
>>> # df -h /vmbackup/
>>> FilesystemSize  Used Avail Use% Mounted on
>>> /dev/mapper/stripe0-backup   37T   24T   13T  64% /backup
>>>
>>> # btrfs filesystem df /backup/
>>> Data, single: total=23.75TiB, used=22.83TiB
>>> System, DUP: total=8.00MiB, used=3.94MiB
>>> Metadata, DUP: total=283.50GiB, used=105.82GiB
>>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>>
>>> I always have a kworker process endless spinning.
>>>
>>> # perf top shows:
>>>   47,56%  [kernel]   [k] rb_next
>>>7,71%  [kernel]   [k] tree_search_offset.isra.25
>>>6,44%  [kernel]   [k] btrfs_find_space_for_alloc
>>>
>>> Mount options:
>>> rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,skip_balance
>>>
>>> What's wrong here?
>>>
>>> Greets,
>>> Stefan
>>>


Re: spinning kworker with space_cache=v2 searching for free space

2016-11-09 Thread Stefan Priebe - Profihost AG
Hello,

found this one from 2014:
https://patchwork.kernel.org/patch/5551651/

Is this still valid?

Am 09.11.2016 um 09:09 schrieb Stefan Priebe - Profihost AG:
> Dear list,
> 
> even though there's a lot of free space on my disk:
> 
> # df -h /vmbackup/
> FilesystemSize  Used Avail Use% Mounted on
> /dev/mapper/stripe0-backup   37T   24T   13T  64% /backup
> 
> # btrfs filesystem df /backup/
> Data, single: total=23.75TiB, used=22.83TiB
> System, DUP: total=8.00MiB, used=3.94MiB
> Metadata, DUP: total=283.50GiB, used=105.82GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> I always have a kworker process endless spinning.
> 
> # perf top shows:
>   47,56%  [kernel]   [k] rb_next
>7,71%  [kernel]   [k] tree_search_offset.isra.25
>6,44%  [kernel]   [k] btrfs_find_space_for_alloc
> 
> Mount options:
> rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,skip_balance
> 
> What's wrong here?
> 
> Greets,
> Stefan
> 


spinning kworker with space_cache=v2 searching for free space

2016-11-09 Thread Stefan Priebe - Profihost AG
Dear list,

even though there's a lot of free space on my disk:

# df -h /vmbackup/
FilesystemSize  Used Avail Use% Mounted on
/dev/mapper/stripe0-backup   37T   24T   13T  64% /backup

# btrfs filesystem df /backup/
Data, single: total=23.75TiB, used=22.83TiB
System, DUP: total=8.00MiB, used=3.94MiB
Metadata, DUP: total=283.50GiB, used=105.82GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I always have a kworker process endless spinning.

# perf top shows:
  47,56%  [kernel]   [k] rb_next
   7,71%  [kernel]   [k] tree_search_offset.isra.25
   6,44%  [kernel]   [k] btrfs_find_space_for_alloc

Mount options:
rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,skip_balance

What's wrong here?

Greets,
Stefan
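A call graph from the spinning worker would show whether the rb_next time really comes from the free-space search path. A hedged sketch of the commands, printed as a plan rather than run here; the kworker selection heuristic and the 10-second sampling window are assumptions:

```shell
# Hedged sketch: sample the busiest kworker's call graph with perf.
# Printed as a plan; run the lines by hand as root on the affected box.
profile_plan() {
    cat <<'EOF'
pid=$(ps -eo pid,comm,%cpu --sort=-%cpu | awk '$2 ~ /^kworker/ {print $1; exit}')
perf record -g -p "$pid" -- sleep 10
perf report --stdio | head -n 40
EOF
}

profile_plan
```

If the report shows rb_next called from tree_search_offset / btrfs_find_space_for_alloc, that points at the in-memory free-space tree walk rather than, say, I/O.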


recover btrfs with kernel 4.9-rc3 but btrfs progs fails

2016-11-03 Thread Stefan Priebe - Profihost AG
Hi,

Currently I've got an fs which triggers this one on mount while originally
having 50% of the disk free - but btrfs-progs fails too.

# btrfs check --repair -p /dev/vdb1
enabling repair mode
couldn't open RDWR because of unsupported option features (3).
ERROR: cannot open file system


[  164.378512] BTRFS info (device vdb1): using free space tree
[  164.378513] BTRFS info (device vdb1): has skinny extents
[  205.671655] [ cut here ]
[  205.671686] WARNING: CPU: 10 PID: 4629 at fs/btrfs/extent-tree.c:2961
btrfs_run_delayed_refs+0x28d/0x2c0 [btrfs]
[  205.671689] BTRFS: error (device vdb1) in
btrfs_run_delayed_refs:2961: errno=-28 No space left
[  205.671695] BTRFS: error (device vdb1) in
btrfs_create_pending_block_groups:10349: errno=-28 No space left
[  205.671764] BTRFS: error (device vdb1) in
btrfs_create_pending_block_groups:10353: errno=-28 No space left
[  205.671770] BTRFS: error (device vdb1) in
add_block_group_free_space:1339: errno=-28 No space left
[  205.671928] BTRFS: Transaction aborted (error -28)
[  205.671929] Modules linked in: netconsole ipt_REJECT nf_reject_ipv4
xt_multiport iptable_filter ip_tables x_tables i2c_piix4 i2c_core button
crc32_pclmul ghash_clmulni_intel loop btrfs xor raid6_pq usbhid
ata_generic virtio_blk virtio_net uhci_hcd ehci_hcd usbcore virtio_pci
usb_common ata_piix floppy
[  205.671943] CPU: 10 PID: 4629 Comm: mount Tainted: GW
4.9.0-rc3 #1
[  205.671944] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.7.5-20140722_172050-sagunt 04/01/2014
[  205.671946]  b2ed110f78f0 a23f2c93 b2ed110f7940

[  205.671948]  b2ed110f7930 a20865f1 0b9129130170
8b4e7f1bc000
[  205.671949]  8b4e29130170 8b4e27e5f000 
003105b4
[  205.671951] Call Trace:
[  205.671956]  [] dump_stack+0x63/0x90
[  205.671960]  [] __warn+0xd1/0xf0
[  205.671962]  [] warn_slowpath_fmt+0x4f/0x60
[  205.671972]  [] btrfs_run_delayed_refs+0x28d/0x2c0
[btrfs]
[  205.671982]  [] btrfs_commit_transaction+0x29/0x70
[btrfs]
[  205.671993]  [] btrfs_recover_log_trees+0x3b3/0x440
[btrfs]
[  205.672004]  [] ? replay_one_extent+0x730/0x730 [btrfs]
[  205.672013]  [] open_ctree+0x264d/0x2760 [btrfs]
[  205.672020]  [] btrfs_mount+0xcc7/0xe00 [btrfs]
[  205.672023]  [] ? pcpu_next_unpop+0x40/0x50
[  205.672025]  [] ? find_next_bit+0x15/0x20
[  205.672026]  [] ? pcpu_alloc+0x32d/0x620
[  205.672028]  [] mount_fs+0x15/0x90
[  205.672030]  [] vfs_kern_mount+0x67/0x110
[  205.672037]  [] btrfs_mount+0x2ac/0xe00 [btrfs]
[  205.672039]  [] ? pcpu_next_unpop+0x40/0x50
[  205.672040]  [] ? find_next_bit+0x15/0x20
[  205.672041]  [] mount_fs+0x15/0x90
[  205.672042]  [] vfs_kern_mount+0x67/0x110
[  205.672044]  [] do_mount+0x192/0xc30
[  205.672045]  [] ? memdup_user+0x42/0x60
[  205.672046]  [] SyS_mount+0x94/0xd0
[  205.672048]  [] do_syscall_64+0x69/0x200
[  205.672049]  [] entry_SYSCALL64_slow_path+0x25/0x25
[  205.672050] ---[ end trace be50fce8648d2575 ]---
[  205.672052] BTRFS: error (device vdb1) in
btrfs_run_delayed_refs:2961: errno=-28 No space left
[  205.672109] BTRFS: error (device vdb1) in btrfs_replay_log:2491:
errno=-28 No space left (Failed to recover log tree)
[  206.061658] BTRFS error (device vdb1): pending csums is 643072
[  206.061801] BTRFS error (device vdb1): cleaner transaction attach
returned -30
[  206.577900] BTRFS error (device vdb1): open_ctree failed

Stefan
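Since the ENOSPC here fires during log replay at mount time, two escape hatches may apply; neither is from this thread, so treat this as a hedged sketch (printed as a plan). The device name is from the report; nologreplay (kernel >= 4.6) requires a read-only mount, and zero-log discards the last fsync'd updates, so it is the last resort.

```shell
# Hedged sketch: get at the data without replaying the tree log.
# Printed as a plan; run the lines by hand.
recover_plan() {
    echo "mount -o ro,nologreplay /dev/vdb1 /mnt   # read files out first"
    echo "btrfs rescue zero-log /dev/vdb1          # destructive: drops the log"
}

recover_plan
```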


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-23 Thread Stefan Priebe - Profihost AG
Hello list,

Just wanted to report that my ENOSPC errors are gone. Thanks to Wang for
his great patches.

But the space_info corruption still occurs.

On every umount I see:
[93022.166222] BTRFS: space_info 4 has 208952672256 free, is not full
[93022.166224] BTRFS: space_info total=363998478336, used=155045216256,
pinned=0, reserved=0, may_use=524288, readonly=65536

Greets,
Stefan

Am 29.09.2016 um 09:27 schrieb Stefan Priebe - Profihost AG:
> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>> I found that compression sometimes reports ENOSPC errors even in 4.8-rc8,
>>>> currently
>>> I cannot confirm that, as I do not have enough space to test this
>>> without compression ;-( But yes, I've got compression enabled.
>> I might not get you, my poor english :)
>> You mean that you only get ENOSPC error when compression is enabled?
>>
>> And when compression is not enabled, you do not get ENOSPC error?
> 
> I can't tell you. I cannot test with compression not enabled. I do not
> have enough free space on this disk.
> 
>>>> I'm trying to fix it.
>>> That sounds good but do you also get the
>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>
>>> kernel messages on umount? if not you might have found another problem.
>> Yes, I see similar messages; you can paste your whole dmesg info here.
> 
> [ cut here ]
> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
> btrfs_free_block_groups+0x346/0x430 [btrfs]()
> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>  880fda777d00 813b69c3 
> c067a099 880fda777d38 810821c6 
> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
> Call Trace:
> [] dump_stack+0x63/0x90
> [] warn_slowpath_common+0x86/0xc0
> [] warn_slowpath_null+0x1a/0x20
> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
> [] close_ctree+0x15d/0x330 [btrfs]
> [] btrfs_put_super+0x19/0x20 [btrfs]
> [] generic_shutdown_super+0x6f/0x100
> [] kill_anon_super+0x12/0x20
> [] btrfs_kill_super+0x16/0xa0 [btrfs]
> [] deactivate_locked_super+0x43/0x70
> [] deactivate_super+0x5c/0x60
> [] cleanup_mnt+0x3f/0x90
> [] __cleanup_mnt+0x12/0x20
> [] task_work_run+0x81/0xa0
> [] exit_to_usermode_loop+0xb0/0xc0
> [] syscall_return_slowpath+0xd4/0x130
> [] int_ret_from_sys_call+0x25/0x8f
> ---[ end trace cee6ace13018e13e ]---
> [ cut here ]
> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791
> btrfs_free_block_groups+0x365/0x430 [btrfs]()
> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1
> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>  880fda777d00 813b69c3 
> c067a099 880fda777d38 810821c6 
> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
> Call Trace:
> [] dump_stack+0x63/0x90
> [] warn_slowpath_common+0x86/0xc0
> [] warn_slowpath_null+0x1a/0x20
> [] btrfs_free_block_groups+0x365/0x430 [btrfs]
> [] close_ctree+0x15d/0x330 [btrfs]
> [] btrfs_put_super+0x19/0x20 [btrfs]
> [] generic_shutdown_super+0x6f/0x100
> [] kill_anon_super+0x12/0x20
> [] btrfs_kill_super+0x16/0xa0 [btrfs]
> [] deactivate_locked_super+0x43/0x70
> [] deactivate_super+0x5c/0x60
> [] cleanup_mnt+0x3f/0x90
> [] __cleanup_mnt+0x12/0x20
> [] task_work_run+0x81/0xa0
> [] exit_to_usermode_loop+0xb0/0xc0
> [] syscall_return_slowpath+0xd4/0x130
> [] in

Re: [PATCH 1/2] btrfs: improve inode's outstanding_extents computation

2016-10-23 Thread Stefan Priebe - Profihost AG
Hello list,

Just want to report again that I've not seen a single ENOSPC message with
this series applied.

Now working fine since 18 days.

Stefan

Am 14.10.2016 um 15:09 schrieb Stefan Priebe - Profihost AG:
> 
> Am 06.10.2016 um 04:51 schrieb Wang Xiaoguang:
>> This issue was revealed by modifying BTRFS_MAX_EXTENT_SIZE (128MB) to 64KB.
>> With that change, the fsstress test often gets these warnings from
>> btrfs_destroy_inode():
>>  WARN_ON(BTRFS_I(inode)->outstanding_extents);
>>  WARN_ON(BTRFS_I(inode)->reserved_extents);
>>
>> Simple test program below can reproduce this issue steadily.
>> Note: you need to modify BTRFS_MAX_EXTENT_SIZE to 64KB to have test,
>> otherwise there won't be such WARNING.
>>  #include <sys/types.h>
>>  #include <sys/stat.h>
>>  #include <fcntl.h>
>>  #include <unistd.h>
>>  #include <string.h>
>>
>>  int main(void)
>>  {
>>  int fd;
>>  char buf[68 * 1024];
>>
>>  memset(buf, 0, 68 * 1024);
>>  fd = open("testfile", O_CREAT | O_EXCL | O_RDWR, 0644);
>>  pwrite(fd, buf, 68 * 1024, 64 * 1024);
>>  return 0;
>>  }
>>
>> When BTRFS_MAX_EXTENT_SIZE is 64KB, and buffered data range is:
>> 64KB            128KB  132KB
>> |----------------|------|
>>       64KB      +  4KB
>>
>> 1) for above data range, btrfs_delalloc_reserve_metadata() will reserve
>> metadata and set BTRFS_I(inode)->outstanding_extents to 2.
>> (68KB + 64KB - 1) / 64KB == 2
>>
>> Outstanding_extents: 2
>>
>> 2) then btrfs_dirty_page() will be called to dirty pages and set
>> EXTENT_DELALLOC flag. In this case, btrfs_set_bit_hook() will be called
>> twice.
>> The 1st set_bit_hook() call will set DEALLOC flag for the first 64K.
>> 64KB            128KB
>> |----------------|
>>   64KB DELALLOC
>> Outstanding_extents: 2
>>
>> Set_bit_hook() uses the FIRST_DELALLOC flag to avoid re-increasing the
>> outstanding_extents counter.
>> So for 1st set_bit_hooks() call, it won't modify outstanding_extents,
>> it's still 2.
>>
>> Then FIRST_DELALLOC flag is *CLEARED*.
>>
>> 3) 2nd btrfs_set_bit_hook() call.
>> Because FIRST_DELALLOC has been cleared by the previous set_bit_hook(),
>> btrfs_set_bit_hook() will increase BTRFS_I(inode)->outstanding_extents by
>> one, so now BTRFS_I(inode)->outstanding_extents is 3.
>> 64KB            128KB  132KB
>> |----------------|------|
>>  64K DELALLOC  4K DELALLOC
>> Outstanding_extents: 3
>>
>> But the correct outstanding_extents number should be 2, not 3.
>> The 2nd btrfs_set_bit_hook() call just screwed up this, and leads to the
>> WARN_ON().
>>
>> Normally, we can solve it by only increasing outstanding_extents in
>> set_bit_hook().
>> But the problem is that for delalloc_reserve/release_metadata(), we only have
>> a 'length' parameter, and calculate an inaccurate outstanding_extents.
>> If we only rely on set_bit_hook(), release_metadata() will screw things up,
>> as it will decrease an inaccurate number.
>>
>> So the fix we use is:
>> 1) Increase *INACCURATE* outstanding_extents at delalloc_reserve_meta
>>Just as a place holder.
>> 2) Increase *accurate* outstanding_extents at set_bit_hooks()
>>This is the real increaser.
>> 3) Decrease *INACCURATE* outstanding_extents before returning
>>This makes outstanding_extents to correct value.
>>
>> For 128M BTRFS_MAX_EXTENT_SIZE, due to limitation of
>> __btrfs_buffered_write(), each iteration will only handle about 2MB
>> data.
>> So btrfs_dirty_pages() won't need to handle cases cross 2 extents.
>>
>> Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com>
> 
> Tested-by: Stefan Priebe <s.pri...@profihost.ag>
> 
> Works fine since 8 days - no ENOSPC errors anymore.
> 
> Greets,
> Stefan
> 
>> ---
>>  fs/btrfs/ctree.h |  2 ++
>>  fs/btrfs/inode.c | 65 
>> ++--
>>  fs/btrfs/ioctl.c |  6 ++
>>  3 files changed, 62 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 33fe035..16885f6 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -3119,6 +3119,

Re: speed up cp --reflink=always

2016-10-17 Thread Stefan Priebe - Profihost AG
Am 17.10.2016 um 03:50 schrieb Qu Wenruo:
> At 10/17/2016 02:54 AM, Stefan Priebe - Profihost AG wrote:
>> Am 16.10.2016 um 00:37 schrieb Hans van Kranenburg:
>>> Hi,
>>>
>>> On 10/15/2016 10:49 PM, Stefan Priebe - Profihost AG wrote:
>>>>
>>>> cp --reflink=always sometimes takes very long (i.e. 25-35 minutes).
>>>>
>>>> An example:
>>>>
>>>> source file:
>>>> # ls -la vm-279-disk-1.img
>>>> -rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img
>>>>
>>>> target file after around 10 minutes:
>>>> # ls -la vm-279-disk-1.img.tmp
>>>> -rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp
>>>
>>> Two quick thoughts:
>>> 1. How many extents does this img have?
>>
>> filefrag says:
>> 1011508 extents found
> 
> Too many fragments.
> Average extent size is only about 200K.
> Quite common for VM images, if not setting no copy-on-write (C) attr.
> 
> Normally it's not a good idea to put VM images into btrfs without any
> tuning.

Those are backups, just written sequentially once. As far as I know the
extent size is hardcoded to 128K for compression, isn't it?

Stefan

> Thanks,
> Qu
>>
>>> 2. Is this an XY problem? Why not just put the img in a subvolume and
>>> snapshot that?
>>
>> Sorry, what's an XY problem?
>>
>> Implementing cp reflink was easier, as the original code was based on
>> XFS. But shouldn't cp reflink / cloning a file be nearly identical to a
>> snapshot? Just creating refs to the extents?
>>
>> Greets,
>> Stefan
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: speed up cp --reflink=always

2016-10-16 Thread Stefan Priebe - Profihost AG
Am 16.10.2016 um 21:48 schrieb Hans van Kranenburg:
> On 10/16/2016 08:54 PM, Stefan Priebe - Profihost AG wrote:
>> Am 16.10.2016 um 00:37 schrieb Hans van Kranenburg:
>>> On 10/15/2016 10:49 PM, Stefan Priebe - Profihost AG wrote:
>>>>
>>>> cp --reflink=always sometimes takes very long (i.e. 25-35 minutes).
>>>>
>>>> An example:
>>>>
>>>> source file:
>>>> # ls -la vm-279-disk-1.img
>>>> -rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img
>>>>
>>>> target file after around 10 minutes:
>>>> # ls -la vm-279-disk-1.img.tmp
>>>> -rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp
>>>
>>> Two quick thoughts:
>>> 1. How many extents does this img have?
>>
>> filefrag says:
>> 1011508 extents found
> 
> To cp --reflink this, the filesystem needs to create a million new
> EXTENT_DATA objects for the new file, which point all parts of the new
> file to all the little same parts of the old file, and probably also
> needs to update a million EXTENT_DATA objects in the btrees to add a
> second backreference back to the new file.

Thanks for this explanation.

> 
>>> 2. Is this an XY problem? Why not just put the img in a subvolume and
>>> snapshot that?
>>
>> Sorry, what's an XY problem?
> 
> It means that I suspected that your actual goal is not spending time to
> work on optimizing how cp --reflink works, but that you just want to use
> the quickest way to have a clone of the file.
> 
> An XY problem is when someone has problem X, then thinks about solution
> Y to solve it, then runs into a problem/limitation/whatever when trying
> Y and asks help with that actual problem when doing Y while there might
> in the end be a better solution to get X done.

ah ;-) makes sense.

>> Implementing cp reflink was easier, as the original code was based on
>> XFS. But shouldn't cp reflink / cloning a file be nearly identical to a
>> snapshot? Just creating refs to the extents?
> 
> Snapshotting a subvolume only has to write a cowed copy of the top-level
> information of the subvolume filesystem tree, and leaves the extent tree
> alone. It doesn't have to do 2 million different things. \o/

Thanks for this explanation. Will look into switching to subvolumes.
I wasn't able to do this before, as I was always running into ENOSPC
issues, which were solved last week.

Greets,
Stefan


Re: speed up cp --reflink=always

2016-10-16 Thread Stefan Priebe - Profihost AG
Am 16.10.2016 um 00:37 schrieb Hans van Kranenburg:
> Hi,
> 
> On 10/15/2016 10:49 PM, Stefan Priebe - Profihost AG wrote:
>>
>> cp --reflink=always sometimes takes very long (i.e. 25-35 minutes).
>>
>> An example:
>>
>> source file:
>> # ls -la vm-279-disk-1.img
>> -rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img
>>
>> target file after around 10 minutes:
>> # ls -la vm-279-disk-1.img.tmp
>> -rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp
> 
> Two quick thoughts:
> 1. How many extents does this img have?

filefrag says:
1011508 extents found

> 2. Is this an XY problem? Why not just put the img in a subvolume and
> snapshot that?

Sorry, what's an XY problem?

Implementing cp reflink was easier, as the original code was based on
XFS. But shouldn't cp reflink / cloning a file be nearly identical to a
snapshot? Just creating refs to the extents?

Greets,
Stefan


speed up cp --reflink=always

2016-10-15 Thread Stefan Priebe - Profihost AG
Hello,

cp --reflink=always sometimes takes very long (i.e. 25-35 minutes).

An example:

source file:
# ls -la vm-279-disk-1.img
-rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img

target file after around 10 minutes:
# ls -la vm-279-disk-1.img.tmp
-rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp

I/O Waits are at around 6% but disk usage is at around 100%.

The process using most of the disk I/O is a kworker process. A function
trace of this kworker for 30s is already 44MB - no idea where to upload.
This volume uses space_cache=v2.

While digging through it I see a lot of these calls:

   kworker/u65:4-20679 [007]  46021.641882: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641882: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641882: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641882: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641882: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641882: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641882: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641882: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641883: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641884: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641885: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641885: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641885: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641885: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641885: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641885: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641886: btrfs_set_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641886: btrfs_get_token_32
<-btrfs_del_items
   kworker/u65:4-20679 [007]  46021.641886: btrfs_set_token_32
<-btrfs_del_items

Sorting the calls shows:
   4892 _raw_spin_lock <-free_extent_buffer
   4894 release_extent_buffer <-free_extent_buffer
   6803 map_private_extent_buffer <-generic_bin_search.constprop.36
   6839 __set_page_dirty_nobuffers <-btree_set_page_dirty
   6840 btree_set_page_dirty <-set_page_dirty
   6840 mem_cgroup_begin_page_stat <-__set_page_dirty_nobuffers
   6840 page_mapping <-set_page_dirty
   6840 set_page_dirty <-set_extent_buffer_dirty
   6841 mem_cgroup_end_page_stat <-__set_page_dirty_nobuffers
   7521 btrfs_clear_lock_blocking_rw <-btrfs_clear_path_blocking
   7967 btrfs_get_token_64 <-read_block_for_search.isra.33
   8018 btrfs_set_token_32 <-btrfs_del_items
   8235 btrfs_get_token_32 <-btrfs_del_items
   8813 btrfs_set_lock_blocking_rw <-btrfs_set_path_blocking
   9235 map_private_extent_buffer <-btrfs_get_token_32
  11824 btrfs_set_token_32 <-btrfs_extend_item
  12090 map_private_extent_buffer <-btrfs_get_token_64
  12367 mark_page_accessed <-mark_extent_buffer_accessed
  12621 btrfs_get_token_32 <-btrfs_extend_item
  16267 

Re: btrfs and numa - needing drop_caches to keep speed up

2016-10-14 Thread Stefan Priebe - Profihost AG
Hi,
Am 14.10.2016 um 15:19 schrieb Stefan Priebe - Profihost AG:
> Dear julian,
> 
> Am 14.10.2016 um 14:26 schrieb Julian Taylor:
>> On 10/14/2016 08:28 AM, Stefan Priebe - Profihost AG wrote:
>>> Hello list,
>>>
>>> While running the same workload on two machines (a single Xeon and a dual
>>> Xeon), both with 64GB RAM,
>>> I need to run echo 3 >/proc/sys/vm/drop_caches every 15-30 minutes to
>>> keep the speed as good as on the non-NUMA system. I'm not sure whether
>>> this is related to NUMA.
>>>
>>> Is there any sysctl parameter to tune?
>>>
>>> Tested with vanilla v4.8.1
>>>
>>> Greets,
>>> Stefan
>>
>> hi,
>> why do you think this is related to btrfs?
> 
> It was just an idea, as I couldn't find any other difference between those
> systems.
> 
>> This is easy to diagnose by recording some kernel stacks during the
>> problem with perf.
> 
> Do you just mean perf top? Does it also show locking problems? I don't see
> much CPU usage in that case.


perf top looks like this:
   5,46%  libc-2.19.so   [.] memset
   5,26%  [kernel]   [k] page_fault
   3,63%  [kernel]   [k] clear_page_c_e
   1,38%  [kernel]   [k] _raw_spin_lock
   1,06%  [kernel]   [k] get_page_from_freelist
   0,83%  [kernel]   [k] copy_user_enhanced_fast_string
   0,79%  [kernel]   [k] release_pages
   0,68%  [kernel]   [k] handle_mm_fault
   0,57%  [kernel]   [k] free_hot_cold_page
   0,55%  [kernel]   [k] handle_pte_fault
   0,54%  [kernel]   [k] __pagevec_lru_add_fn
   0,45%  [kernel]   [k] unmap_page_range
   0,45%  [kernel]   [k] __mod_zone_page_state
   0,43%  [kernel]   [k] page_add_new_anon_rmap
   0,38%  [kernel]   [k] free_pcppages_bulk

> 
>> The only known issue with this type of workaround that I know of is
>> transparent huge pages.
> 
> I already disabled thp by:
> echo never > /sys/kernel/mm/transparent_hugepage/enabled
> 
> cat /proc/meminfo says:
> HugePages_Total:   0
> HugePages_Free:0
> HugePages_Rsvd:0
> HugePages_Surp:0
> 
> 
> 
> Greets,
> Stefan
> 
>>
>> cheers,
>> Julian


Re: btrfs and numa - needing drop_caches to keep speed up

2016-10-14 Thread Stefan Priebe - Profihost AG
Dear julian,

Am 14.10.2016 um 14:26 schrieb Julian Taylor:
> On 10/14/2016 08:28 AM, Stefan Priebe - Profihost AG wrote:
>> Hello list,
>>
>> While running the same workload on two machines (a single Xeon and a dual
>> Xeon), both with 64GB RAM,
>> I need to run echo 3 >/proc/sys/vm/drop_caches every 15-30 minutes to
>> keep the speed as good as on the non-NUMA system. I'm not sure whether
>> this is related to NUMA.
>>
>> Is there any sysctl parameter to tune?
>>
>> Tested with vanilla v4.8.1
>>
>> Greets,
>> Stefan
> 
> hi,
> why do you think this is related to btrfs?

It was just an idea, as I couldn't find any other difference between those
systems.

> This is easy to diagnose by recording some kernel stacks during the
> problem with perf.

Do you just mean perf top? Does it also show locking problems? I don't see
much CPU usage in that case.

> The only known issue with this type of workaround that I know of is
> transparent huge pages.

I already disabled thp by:
echo never > /sys/kernel/mm/transparent_hugepage/enabled

cat /proc/meminfo says:
HugePages_Total:   0
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0



Greets,
Stefan

> 
> cheers,
> Julian


Re: [PATCH 2/2] btrfs: fix false enospc for compression

2016-10-14 Thread Stefan Priebe - Profihost AG

Am 06.10.2016 um 04:51 schrieb Wang Xiaoguang:
> When testing btrfs compression, we sometimes got an ENOSPC error, though the fs
> still has much free space; xfstests generic/171, generic/172, generic/173,
> generic/174, generic/175 can reveal this bug in my test environment when
> compression is enabled.
> 
> After some debugging work, we found that it's btrfs_delalloc_reserve_metadata()
> which sometimes tries to reserve plenty of metadata space, even for a very small
> data range. In btrfs_delalloc_reserve_metadata(), the number of metadata bytes
> we try to reserve is calculated by the difference between outstanding_extents
> and reserved_extents. Please see below case for how ENOSPC occurs:
> 
>   1, Buffered write 128MB of data in units of 128KB, so finally we'll have the
> inode's outstanding extents be 1, and reserved_extents be 1024. Note it's
> btrfs_merge_extent_hook() that merges these 128KB units into one big
> outstanding extent, but does not change reserved_extents.
> 
>   2, When writing dirty pages, for compression, cow_file_range_async() will
> split the above big extent in units of 128KB (the compression extent size is
> 128KB). When the first split operation finishes, we'll have 2 outstanding extents and 1024
> reserved extents, and just right now the currently generated ordered extent is
> dispatched to run and complete, then btrfs_delalloc_release_metadata()(see
> btrfs_finish_ordered_io()) will be called to release metadata, after that we
> will have 1 outstanding extents and 1 reserved extents(also see logic in
> drop_outstanding_extent()). Later cow_file_range_async() continues to handle
> the left data range [128KB, 128MB), and if no other ordered extent was dispatched
> to run, there will be 1023 outstanding extents and 1 reserved extent.
> 
>   3, Now if another buffered write for this file enters, then
> btrfs_delalloc_reserve_metadata() will at least try to reserve metadata
> for 1023 outstanding extents' metadata. For a 16KB node size it'll be
> 1023*16384*2*8, about 255MB; for a 64K node size it'll be 1023*65536*8*2,
> about 1GB of metadata, so obviously it's not sane and can easily result in
> an ENOSPC error.
> 
> The root cause is that for compression, its max extent size will no longer be
> BTRFS_MAX_EXTENT_SIZE(128MB), it'll be 128KB, so current metadata reservation
> method in btrfs is not appropriate or correct, here we introduce:
>   enum btrfs_metadata_reserve_type {
>   BTRFS_RESERVE_NORMAL,
>   BTRFS_RESERVE_COMPRESS,
>   };
> and expand btrfs_delalloc_reserve_metadata() and 
> btrfs_delalloc_reserve_space()
> by adding a new enum btrfs_metadata_reserve_type argument. When a data range 
> will
> go through compression, we use BTRFS_RESERVE_COMPRESS to reserve metadata.
> Meanwhile we introduce EXTENT_COMPRESS flag to mark a data range that will go
> through compression path.
> 
> With this patch, we can fix these false enospc error for compression.
> 
> Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com>

Tested-by: Stefan Priebe <s.pri...@profihost.ag>

Works fine since 8 days - no ENOSPC errors anymore.

Greets,
Stefan


> ---
>  fs/btrfs/ctree.h |  31 ++--
>  fs/btrfs/extent-tree.c   |  55 +
>  fs/btrfs/extent_io.c |  59 +-
>  fs/btrfs/extent_io.h |   2 +
>  fs/btrfs/file.c  |  26 +--
>  fs/btrfs/free-space-cache.c  |   6 +-
>  fs/btrfs/inode-map.c |   5 +-
>  fs/btrfs/inode.c | 181 
> ---
>  fs/btrfs/ioctl.c |  12 ++-
>  fs/btrfs/relocation.c|  14 +++-
>  fs/btrfs/tests/inode-tests.c |  15 ++--
>  11 files changed, 309 insertions(+), 97 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 16885f6..fa6a19a 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -97,6 +97,19 @@ static const int btrfs_csum_sizes[] = { 4 };
>  
>  #define BTRFS_DIRTY_METADATA_THRESH  SZ_32M
>  
> +/*
> + * for compression, max file extent size would be limited to 128K, so when
> + * reserving metadata for such delalloc writes, pass BTRFS_RESERVE_COMPRESS 
> to
> + * btrfs_delalloc_reserve_metadata() or btrfs_delalloc_reserve_space() to
> + * calculate metadata, for none-compression, use BTRFS_RESERVE_NORMAL.
> + */
> +enum btrfs_metadata_reserve_type {
> + BTRFS_RESERVE_NORMAL,
> + BTRFS_RESERVE_COMPRESS,
> +};
> +int inode_need_compress(struct inode *inode);
> +u64 btrfs_max_extent_size(enum btrfs_metadata_reserve_type reserve_type);
> +
>  #define BTRFS_MAX_EXTENT_SIZE SZ_128M
>  
>  struct btrfs_mapping_tree {
> @@ -2677,10 +2690,14 @@ int btrfs_subvolume_reserve_metadata(struct 
&g

Re: [PATCH 1/2] btrfs: improve inode's outstanding_extents computation

2016-10-14 Thread Stefan Priebe - Profihost AG

Am 06.10.2016 um 04:51 schrieb Wang Xiaoguang:
> This issue was revealed by modifying BTRFS_MAX_EXTENT_SIZE (128MB) to 64KB.
> With that change, the fsstress test often gets these warnings from
> btrfs_destroy_inode():
>   WARN_ON(BTRFS_I(inode)->outstanding_extents);
>   WARN_ON(BTRFS_I(inode)->reserved_extents);
> 
> Simple test program below can reproduce this issue steadily.
> Note: you need to modify BTRFS_MAX_EXTENT_SIZE to 64KB to have test,
> otherwise there won't be such WARNING.
>   #include <sys/types.h>
>   #include <sys/stat.h>
>   #include <fcntl.h>
>   #include <unistd.h>
>   #include <string.h>
> 
>   int main(void)
>   {
>   int fd;
>   char buf[68 * 1024];
> 
>   memset(buf, 0, 68 * 1024);
>   fd = open("testfile", O_CREAT | O_EXCL | O_RDWR, 0644);
>   pwrite(fd, buf, 68 * 1024, 64 * 1024);
>   return 0;
>   }
> 
> When BTRFS_MAX_EXTENT_SIZE is 64KB, and buffered data range is:
> 64KB            128KB  132KB
> |----------------|------|
>       64KB      +  4KB
> 
> 1) for above data range, btrfs_delalloc_reserve_metadata() will reserve
> metadata and set BTRFS_I(inode)->outstanding_extents to 2.
> (68KB + 64KB - 1) / 64KB == 2
> 
> Outstanding_extents: 2
> 
> 2) then btrfs_dirty_page() will be called to dirty pages and set
> EXTENT_DELALLOC flag. In this case, btrfs_set_bit_hook() will be called
> twice.
> The 1st set_bit_hook() call will set DEALLOC flag for the first 64K.
> 64KB            128KB
> |----------------|
>   64KB DELALLOC
> Outstanding_extents: 2
> 
> Set_bit_hook() uses the FIRST_DELALLOC flag to avoid re-increasing the
> outstanding_extents counter.
> So for 1st set_bit_hooks() call, it won't modify outstanding_extents,
> it's still 2.
> 
> Then FIRST_DELALLOC flag is *CLEARED*.
> 
> 3) 2nd btrfs_set_bit_hook() call.
> Because FIRST_DELALLOC has been cleared by the previous set_bit_hook(),
> btrfs_set_bit_hook() will increase BTRFS_I(inode)->outstanding_extents by
> one, so now BTRFS_I(inode)->outstanding_extents is 3.
> 64KB            128KB  132KB
> |----------------|------|
>  64K DELALLOC  4K DELALLOC
> Outstanding_extents: 3
> 
> But the correct outstanding_extents number should be 2, not 3.
> The 2nd btrfs_set_bit_hook() call just screwed up this, and leads to the
> WARN_ON().
> 
> Normally, we can solve it by only increasing outstanding_extents in
> set_bit_hook().
> But the problem is that for delalloc_reserve/release_metadata(), we only have
> a 'length' parameter, and calculate an inaccurate outstanding_extents.
> If we only rely on set_bit_hook(), release_metadata() will screw things up,
> as it will decrease an inaccurate number.
> 
> So the fix we use is:
> 1) Increase *INACCURATE* outstanding_extents at delalloc_reserve_meta
>Just as a place holder.
> 2) Increase *accurate* outstanding_extents at set_bit_hooks()
>This is the real increaser.
> 3) Decrease *INACCURATE* outstanding_extents before returning
>This makes outstanding_extents to correct value.
> 
> For 128M BTRFS_MAX_EXTENT_SIZE, due to limitation of
> __btrfs_buffered_write(), each iteration will only handle about 2MB
> data.
> So btrfs_dirty_pages() won't need to handle cases cross 2 extents.
> 
> Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com>

Tested-by: Stefan Priebe <s.pri...@profihost.ag>

Works fine since 8 days - no ENOSPC errors anymore.

Greets,
Stefan

> ---
>  fs/btrfs/ctree.h |  2 ++
>  fs/btrfs/inode.c | 65 
> ++--
>  fs/btrfs/ioctl.c |  6 ++
>  3 files changed, 62 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 33fe035..16885f6 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3119,6 +3119,8 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info 
> *fs_info, int delay_iput,
>  int nr);
>  int btrfs_set_extent_delalloc(struct inode *inode, u64 start, u64 end,
> struct extent_state **cached_state);
> +int btrfs_set_extent_defrag(struct inode *inode, u64 start, u64 end,
> + struct extent_state **cached_state);
>  int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
>struct btrfs_root *new_root,
>struct btrfs_root *parent_r

btrfs and numa - needing drop_caches to keep speed up

2016-10-14 Thread Stefan Priebe - Profihost AG
Hello list,

While running the same workload on two machines (a single Xeon and a dual
Xeon), both with 64GB RAM,
I need to run echo 3 >/proc/sys/vm/drop_caches every 15-30 minutes to
keep the speed as good as on the non-NUMA system. I'm not sure whether
this is related to NUMA.

Is there any sysctl parameter to tune?

Tested with vanilla v4.8.1

Greets,
Stefan


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-10 Thread Stefan Priebe - Profihost AG
Dear Wang,

Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang:
> Hi,
> 
> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>> I found that compression sometimes reports an ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that, as I do not have enough space to test this
>>>> without
>>>> compression ;-( But yes, I have compression enabled.
>>> I might not get you, my poor English :)
>>> You mean that you only get an ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get an ENOSPC error?
>> I can't tell you. I cannot test with compression disabled; I do not
>> have enough free space on this disk.
> I have just sent two patches to fix the false ENOSPC error for compression;
> please have a try, they fix the false ENOSPC error in my test environment:
> btrfs: fix false enospc for compression
> btrfs: improve inode's outstanding_extents computation
> 
> I applied these two patches on the Linux upstream tree; the latest commit
> is 41844e36206be90cd4d962ea49b0abc3612a99d0.

No ENOSPC errors for 5 days! That's amazing. I hope it stays
this way and your patches get into 4.9.

Greets,
Stefan

> 
> Regards,
> Xiaoguang Wang
> 
>>
>>>>> I'm trying to fix it.
>>>> That sounds good but do you also get the
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> kernel messages on umount? If not, you might have found another problem.
>>> Yes, I see similar messages; you can paste your whole dmesg info here.
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowpath_common+0x86/0xc0
>> [] warn_slowpath_null+0x1a/0x20
>> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
>> [] close_ctree+0x15d/0x330 [btrfs]
>> [] btrfs_put_super+0x19/0x20 [btrfs]
>> [] generic_shutdown_super+0x6f/0x100
>> [] kill_anon_super+0x12/0x20
>> [] btrfs_kill_super+0x16/0xa0 [btrfs]
>> [] deactivate_locked_super+0x43/0x70
>> [] deactivate_super+0x5c/0x60
>> [] cleanup_mnt+0x3f/0x90
>> [] __cleanup_mnt+0x12/0x20
>> [] task_work_run+0x81/0xa0
>> [] exit_to_usermode_loop+0xb0/0xc0
>> [] syscall_return_slowpath+0xd4/0x130
>> [] int_ret_from_sys_call+0x25/0x8f
>> ---[ end trace cee6ace13018e13e ]---
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791
>> btrfs_free_block_groups+0x365/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowpath_common+0x86/0xc0

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-08 Thread Stefan Priebe - Profihost AG
The main difference between the systems is where the OOM happens:
- Single Xeon => no OOM
- Dual Xeon / NUMA => OOM

Both have 64 GB of memory.
On 07.10.2016 at 11:33, Holger Hoffstätte wrote:
> On 10/07/16 09:17, Wang Xiaoguang wrote:
>> Hi,
>>
>> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear Wang,
>>>
>>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>>
>>> Complete traces with your patch and some more btrfs patches applied (in
>>> the hope in fixes the OOM but it did not):
>>> http://pastebin.com/raw/6vmRSDm1
>> I didn't see any such OOMs...
>> Can you try holger's tree with my patches.
> 
> They don't really apply to either 4.4.x (because it has diverged too
> much by now) or 4.8.x because of the initial dedupe support which came
> in as part of 4.9rc1 - there are way too many conflicts all over the
> place and merging them manually took way too much time.
> It would be useful if you could rebase your patches to for-next.
> 
> Stefan, have you tried setting THP to 'madvise' or 'never'?
> Try 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled'
> or boot with transparent_hugepage=madvise (or never) kernel flag.
> I have no idea if it will help, but it's worth a try.
> 
> -h
> 


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-08 Thread Stefan Priebe - Profihost AG
Hi Wang,

Currently the system is working fine - no ENOSPC errors. But it will
take a week to be sure they don't come back.
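One way to confirm over that week that the errors stay gone is to scan a saved kernel log for the message from the thread's subject line. A minimal sketch (my own, not from the thread); the default log path is an assumption, adjust as needed:

```shell
#!/bin/sh
# Scan a saved kernel log (path in $1, default assumed below) for the
# "space_info ... free, is not full" message from this thread's subject.
log="${1:-/var/log/kern.log}"
if grep -E 'space_info [0-9]+ has [0-9]+ free, is not full' "$log"; then
    echo "symptom still present in $log"
else
    echo "no ENOSPC symptom found in $log"
fi
```

Running it daily (or from cron) gives a simple yes/no answer without watching dmesg by hand.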

Thanks!

Greets,
Stefan
On 06.10.2016 at 05:04, Wang Xiaoguang wrote:
> Hi,
> 
> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that as i do not have anough space to test this
>>>> without
>>>> compression ;-( But yes i've compression enabled.
>>> I might not get you, my poor english :)
>>> You mean that you only get ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get ENOSPC error?
>> I can't tell you. I cannot test with compression not enabled. I do not
>> have anough free space on this disk.
> I had just sent two patches to fix false enospc error for compression,
> please have a try, they fix false enospc error in my test environment.
> btrfs: fix false enospc for compression
> btrfs: improve inode's outstanding_extents computation
> 
> I apply these two patchs in linux upstream tree, the latest commit
> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
> 
> Regards,
> Xiaoguang Wang
> 
>>
>>>>> I'm trying to fix it.
>>>> That sounds good but do you also get the
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> kernel messages on umount? if not you might have found another problem.
>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowpath_common+0x86/0xc0
>> [] warn_slowpath_null+0x1a/0x20
>> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
>> [] close_ctree+0x15d/0x330 [btrfs]
>> [] btrfs_put_super+0x19/0x20 [btrfs]
>> [] generic_shutdown_super+0x6f/0x100
>> [] kill_anon_super+0x12/0x20
>> [] btrfs_kill_super+0x16/0xa0 [btrfs]
>> [] deactivate_locked_super+0x43/0x70
>> [] deactivate_super+0x5c/0x60
>> [] cleanup_mnt+0x3f/0x90
>> [] __cleanup_mnt+0x12/0x20
>> [] task_work_run+0x81/0xa0
>> [] exit_to_usermode_loop+0xb0/0xc0
>> [] syscall_return_slowpath+0xd4/0x130
>> [] int_ret_from_sys_call+0x25/0x8f
>> ---[ end trace cee6ace13018e13e ]---
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791
>> btrfs_free_block_groups+0x365/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowp

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-07 Thread Stefan Priebe - Profihost AG
Hi Holger,

On 07.10.2016 at 11:33, Holger Hoffstätte wrote:
> On 10/07/16 09:17, Wang Xiaoguang wrote:
>> Hi,
>>
>> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear Wang,
>>>
>>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>>
>>> Complete traces with your patch and some more btrfs patches applied (in
>>> the hope in fixes the OOM but it did not):
>>> http://pastebin.com/raw/6vmRSDm1
>> I didn't see any such OOMs...
>> Can you try holger's tree with my patches.
> 
> They don't really apply to either 4.4.x (because it has diverged too
> much by now) or 4.8.x because of the initial dedupe support which came
> in as part of 4.9rc1 - there are way too many conflicts all over the
> place and merging them manually took way too much time.
> It would be useful if you could rebase your patches to for-next.
> 
> Stefan, have you tried setting THP to 'madvise' or 'never'?
> Try 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled'
> or boot with transparent_hugepage=madvise (or never) kernel flag.
> I have no idea if it will help, but it's worth a try.

It's already set to never. The hosts are currently still up and running,
but only if I run

echo 3 > /proc/sys/vm/drop_caches

every 30 minutes. It seems the kernel fails to reclaim the cache itself
when user space needs memory.
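The 30-minute manual workaround above could be automated with cron. Below is a hypothetical root crontab fragment (the schedule and the leading sync are my assumptions, not from the thread); it only papers over the reclaim problem rather than fixing it:

```shell
# Hypothetical crontab fragment (root). "sync" flushes dirty pages first;
# "echo 3" then drops the page cache plus dentries and inodes, mirroring
# the manual workaround above.
*/30 * * * *  sync; echo 3 > /proc/sys/vm/drop_caches
```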

Greets,
Stefan

> 
> -h
> 


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-07 Thread Stefan Priebe - Profihost AG
On 07.10.2016 at 10:07, Wang Xiaoguang wrote:
> hello,
> 
> On 10/07/2016 04:06 PM, Stefan Priebe - Profihost AG wrote:
>> and it shows:
>>
>> PAG | scan 33829e5  | steal 1968e3  | stall  0  |  |
>>|   |  swin  257071 |  swout 346960 |
>>
>> but the highest user space prog uses only 190MB.
> If you don't apply my patches, there will be no OOMs in your test
> environment?
> I want to confirm whether this OOM is caused by my patches...

This also happens without your patches. That's what I meant by
"can't use v4.8.0".

Is it OK to try v4.7.6?

Greets,
Stefan

> 
> Regards,
> Xiaoguang Wang
> 
>>
>> greets,
>> Stefan
>>
>> Am 07.10.2016 um 09:17 schrieb Wang Xiaoguang:
>>> Hi,
>>>
>>> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>>>> Dear Wang,
>>>>
>>>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>>>
>>>> Complete traces with your patch and some more btrfs patches applied (in
>>>> the hope in fixes the OOM but it did not):
>>>> http://pastebin.com/raw/6vmRSDm1
>>> I didn't see any such OOMs...
>>> Can you try holger's tree with my patches.
>>>
>>> Regards,
>>> Xiaoguang Wang
>>>> Greets,
>>>> Stefan
>>>> Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang:
>>>>> Hi,
>>>>>
>>>>> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>>>>>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>>>>>> I found that compress sometime report ENOSPC error even in
>>>>>>>>> 4.8-rc8,
>>>>>>>>> currently
>>>>>>>> I cannot confirm that as i do not have anough space to test this
>>>>>>>> without
>>>>>>>> compression ;-( But yes i've compression enabled.
>>>>>>> I might not get you, my poor english :)
>>>>>>> You mean that you only get ENOSPC error when compression is enabled?
>>>>>>>
>>>>>>> And when compression is not enabled, you do not get ENOSPC error?
>>>>>> I can't tell you. I cannot test with compression not enabled. I do
>>>>>> not
>>>>>> have anough free space on this disk.
>>>>> I had just sent two patches to fix false enospc error for compression,
>>>>> please have a try, they fix false enospc error in my test environment.
>>>>>   btrfs: fix false enospc for compression
>>>>>   btrfs: improve inode's outstanding_extents computation
>>>>>
>>>>> I apply these two patchs in linux upstream tree, the latest commit
>>>>> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
>>>>>
>>>>> Regards,
>>>>> Xiaoguang Wang
>>>>>
>>>>>>>>> I'm trying to fix it.
>>>>>>>> That sounds good but do you also get the
>>>>>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>>>>>
>>>>>>>> kernel messages on umount? if not you might have found another
>>>>>>>> problem.
>>>>>>> Yes, I seem similar messages, you can paste you whole dmesg info
>>>>>>> here.
>>>>>> [ cut here ]
>>>>>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>>>>>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>>>>>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>>>>>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp
>>>>>> kvm_intel kvm
>>>>>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>>>>>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>>>>>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>>>>>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb
>>>>>> i2c_algo_bit
>>>>>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>>>>>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>>>>>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>>>>>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>>>>> 

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-07 Thread Stefan Priebe - Profihost AG
and it shows:

PAG | scan 33829e5  | steal 1968e3  | stall  0  |  |
  |   |  swin  257071 |  swout 346960 |

but the largest user-space process uses only 190 MB.

greets,
Stefan

On 07.10.2016 at 09:17, Wang Xiaoguang wrote:
> Hi,
> 
> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>> Dear Wang,
>>
>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>
>> Complete traces with your patch and some more btrfs patches applied (in
>> the hope in fixes the OOM but it did not):
>> http://pastebin.com/raw/6vmRSDm1
> I didn't see any such OOMs...
> Can you try holger's tree with my patches.
> 
> Regards,
> Xiaoguang Wang
>>
>> Greets,
>> Stefan
>> Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang:
>>> Hi,
>>>
>>> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>>>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>>>> currently
>>>>>> I cannot confirm that as i do not have anough space to test this
>>>>>> without
>>>>>> compression ;-( But yes i've compression enabled.
>>>>> I might not get you, my poor english :)
>>>>> You mean that you only get ENOSPC error when compression is enabled?
>>>>>
>>>>> And when compression is not enabled, you do not get ENOSPC error?
>>>> I can't tell you. I cannot test with compression not enabled. I do not
>>>> have anough free space on this disk.
>>> I had just sent two patches to fix false enospc error for compression,
>>> please have a try, they fix false enospc error in my test environment.
>>>  btrfs: fix false enospc for compression
>>>  btrfs: improve inode's outstanding_extents computation
>>>
>>> I apply these two patchs in linux upstream tree, the latest commit
>>> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
>>>
>>> Regards,
>>> Xiaoguang Wang
>>>
>>>>>>> I'm trying to fix it.
>>>>>> That sounds good but do you also get the
>>>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>>>
>>>>>> kernel messages on umount? if not you might have found another
>>>>>> problem.
>>>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>>>> [ cut here ]
>>>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>>>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>>>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>>>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>>>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>>>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>>>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>>>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>>>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>>>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>>>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>>>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>>>  880fda777d00 813b69c3 
>>>> c067a099 880fda777d38 810821c6 
>>>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>>>> Call Trace:
>>>> [] dump_stack+0x63/0x90
>>>> [] warn_slowpath_common+0x86/0xc0
>>>> [] warn_slowpath_null+0x1a/0x20
>>>> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
>>>> [] close_ctree+0x15d/0x330 [btrfs]
>>>> [] btrfs_put_super+0x19/0x20 [btrfs]
>>>> [] generic_shutdown_super+0x6f/0x100
>>>> [] kill_anon_super+0x12/0x20
>>>> [] btrfs_kill_super+0x16/0xa0 [btrfs]
>>>> [] deactivate_locked_super+0x43/0x70
>>>> [] deactivate_super+0x5c/0x60
>>>> [] cleanup_mnt+0x3f/0x90
>>>> [] __cleanup_mnt+0x12/0x20
>>>> [] task_work_run+0x81/0xa0
>>>> [] exit_to_usermode_loop+0xb0/0xc0
>>>> [] syscall_return_slowpath+0xd4/0x130
>>>> [] int_ret_from_sys_call+0x25/0x8f
>>>> ---[ end trace cee6ace13018e13e ]---
>>>> [ cu

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-07 Thread Stefan Priebe - Profihost AG
This is what atop shows for memory usage 5 minutes before the crash:

MEM | tot62.8G  | free  198.2M  | cache  56.8G  | buff1.4M |
slab3.5G |  shmem   1.1M |  vmbal   0.0M |  hptot   0.0M |

SWP | tot 3.7G  | free3.2G  |   |  |
  |   |  vmcom   2.8G |  vmlim  35.1G |

Greets,
Stefan

On 07.10.2016 at 09:17, Wang Xiaoguang wrote:
> Hi,
> 
> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>> Dear Wang,
>>
>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>
>> Complete traces with your patch and some more btrfs patches applied (in
>> the hope in fixes the OOM but it did not):
>> http://pastebin.com/raw/6vmRSDm1
> I didn't see any such OOMs...
> Can you try holger's tree with my patches.
> 
> Regards,
> Xiaoguang Wang
>>
>> Greets,
>> Stefan
>> Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang:
>>> Hi,
>>>
>>> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>>>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>>>> currently
>>>>>> I cannot confirm that as i do not have anough space to test this
>>>>>> without
>>>>>> compression ;-( But yes i've compression enabled.
>>>>> I might not get you, my poor english :)
>>>>> You mean that you only get ENOSPC error when compression is enabled?
>>>>>
>>>>> And when compression is not enabled, you do not get ENOSPC error?
>>>> I can't tell you. I cannot test with compression not enabled. I do not
>>>> have anough free space on this disk.
>>> I had just sent two patches to fix false enospc error for compression,
>>> please have a try, they fix false enospc error in my test environment.
>>>  btrfs: fix false enospc for compression
>>>  btrfs: improve inode's outstanding_extents computation
>>>
>>> I apply these two patchs in linux upstream tree, the latest commit
>>> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
>>>
>>> Regards,
>>> Xiaoguang Wang
>>>
>>>>>>> I'm trying to fix it.
>>>>>> That sounds good but do you also get the
>>>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>>>
>>>>>> kernel messages on umount? if not you might have found another
>>>>>> problem.
>>>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>>>> [ cut here ]
>>>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>>>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>>>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>>>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>>>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>>>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>>>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>>>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>>>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>>>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>>>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>>>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>>>  880fda777d00 813b69c3 
>>>> c067a099 880fda777d38 810821c6 
>>>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>>>> Call Trace:
>>>> [] dump_stack+0x63/0x90
>>>> [] warn_slowpath_common+0x86/0xc0
>>>> [] warn_slowpath_null+0x1a/0x20
>>>> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
>>>> [] close_ctree+0x15d/0x330 [btrfs]
>>>> [] btrfs_put_super+0x19/0x20 [btrfs]
>>>> [] generic_shutdown_super+0x6f/0x100
>>>> [] kill_anon_super+0x12/0x20
>>>> [] btrfs_kill_super+0x16/0xa0 [btrfs]
>>>> [] deactivate_locked_super+0x43/0x70
>>>> [] deactivate_super+0x5c/0x60
>>>> [] cleanup_mnt+0x3f/0x90
>>>> [] __cleanup_mnt+0x12/0x20
>>>> [] task_work_run+0x81/0xa0
>>>> [] exit_to_usermode_loop+0xb0/0xc0
>>>> [] syscall_return_slowpath+0xd4/0x130
>>>> []

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-07 Thread Stefan Priebe - Profihost AG
On 07.10.2016 at 09:17, Wang Xiaoguang wrote:
> Hi,
> 
> On 10/07/2016 03:03 PM, Stefan Priebe - Profihost AG wrote:
>> Dear Wang,
>>
>> can't use v4.8.0 as i always get OOMs and total machine crashes.
>>
>> Complete traces with your patch and some more btrfs patches applied (in
>> the hope in fixes the OOM but it did not):
>> http://pastebin.com/raw/6vmRSDm1
> I didn't see any such OOMs...
> Can you try holger's tree with my patches.

Dear Wang, I already tried that; it doesn't help. It also happens on
only two out of three servers. It starts killing low-memory processes
after a while, but I've no idea where all that memory is consumed.
(The hosts have 64 GB.)
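Where that memory actually sits can usually be narrowed down from /proc/meminfo before the OOM killer fires, separating page cache, slab, and anonymous memory. A minimal diagnostic sketch (my own suggestion, not something from this thread):

```shell
#!/bin/sh
# Summarize the main /proc/meminfo buckets in MiB, to tell page cache
# (Cached/Buffers), slab (reclaimable vs. not), and anonymous memory
# apart when hunting for unexplained memory consumption.
awk '/^(MemFree|Buffers|Cached|Slab|SReclaimable|SUnreclaim|AnonPages):/ {
    printf "%-14s %8.1f MiB\n", $1, $2 / 1024
}' /proc/meminfo
```

A large SUnreclaim or a Cached value that never shrinks under pressure points at kernel-side reclaim problems rather than a user-space leak.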

Greets,
Stefan


> Regards,
> Xiaoguang Wang
>>
>> Greets,
>> Stefan
>> Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang:
>>> Hi,
>>>
>>> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>>>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>>>> currently
>>>>>> I cannot confirm that as i do not have anough space to test this
>>>>>> without
>>>>>> compression ;-( But yes i've compression enabled.
>>>>> I might not get you, my poor english :)
>>>>> You mean that you only get ENOSPC error when compression is enabled?
>>>>>
>>>>> And when compression is not enabled, you do not get ENOSPC error?
>>>> I can't tell you. I cannot test with compression not enabled. I do not
>>>> have anough free space on this disk.
>>> I had just sent two patches to fix false enospc error for compression,
>>> please have a try, they fix false enospc error in my test environment.
>>>  btrfs: fix false enospc for compression
>>>  btrfs: improve inode's outstanding_extents computation
>>>
>>> I apply these two patchs in linux upstream tree, the latest commit
>>> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
>>>
>>> Regards,
>>> Xiaoguang Wang
>>>
>>>>>>> I'm trying to fix it.
>>>>>> That sounds good but do you also get the
>>>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>>>
>>>>>> kernel messages on umount? if not you might have found another
>>>>>> problem.
>>>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>>>> [ cut here ]
>>>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>>>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>>>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>>>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>>>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>>>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>>>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>>>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>>>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>>>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>>>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>>>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>>>  880fda777d00 813b69c3 
>>>> c067a099 880fda777d38 810821c6 
>>>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>>>> Call Trace:
>>>> [] dump_stack+0x63/0x90
>>>> [] warn_slowpath_common+0x86/0xc0
>>>> [] warn_slowpath_null+0x1a/0x20
>>>> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
>>>> [] close_ctree+0x15d/0x330 [btrfs]
>>>> [] btrfs_put_super+0x19/0x20 [btrfs]
>>>> [] generic_shutdown_super+0x6f/0x100
>>>> [] kill_anon_super+0x12/0x20
>>>> [] btrfs_kill_super+0x16/0xa0 [btrfs]
>>>> [] deactivate_locked_super+0x43/0x70
>>>> [] deactivate_super+0x5c/0x60
>>>> [] cleanup_mnt+0x3f/0x90
>>>> [] __cleanup_mnt+0x12/0x20
>>>> [] task_work_run+0x81/0xa0
>>>> [] exit_to_usermode_loop+0xb0/0xc0
>>>> [] syscall_return_slowpath+0xd4/0x130
>>>> [] int_ret_from_sys_call+0x25/0x8f
>>>> ---[ end trace cee6ace13018e13e ]---
>>>> [ cut h

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-07 Thread Stefan Priebe - Profihost AG
Dear Wang,

I can't use v4.8.0 as I always get OOMs and total machine crashes.

Complete traces with your patch and some more btrfs patches applied (in
the hope in fixes the OOM but it did not):
http://pastebin.com/raw/6vmRSDm1

Greets,
Stefan
On 06.10.2016 at 05:04, Wang Xiaoguang wrote:
> Hi,
> 
> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that as i do not have anough space to test this
>>>> without
>>>> compression ;-( But yes i've compression enabled.
>>> I might not get you, my poor english :)
>>> You mean that you only get ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get ENOSPC error?
>> I can't tell you. I cannot test with compression not enabled. I do not
>> have anough free space on this disk.
> I had just sent two patches to fix false enospc error for compression,
> please have a try, they fix false enospc error in my test environment.
> btrfs: fix false enospc for compression
> btrfs: improve inode's outstanding_extents computation
> 
> I apply these two patchs in linux upstream tree, the latest commit
> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
> 
> Regards,
> Xiaoguang Wang
> 
>>
>>>>> I'm trying to fix it.
>>>> That sounds good but do you also get the
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> kernel messages on umount? if not you might have found another problem.
>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowpath_common+0x86/0xc0
>> [] warn_slowpath_null+0x1a/0x20
>> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
>> [] close_ctree+0x15d/0x330 [btrfs]
>> [] btrfs_put_super+0x19/0x20 [btrfs]
>> [] generic_shutdown_super+0x6f/0x100
>> [] kill_anon_super+0x12/0x20
>> [] btrfs_kill_super+0x16/0xa0 [btrfs]
>> [] deactivate_locked_super+0x43/0x70
>> [] deactivate_super+0x5c/0x60
>> [] cleanup_mnt+0x3f/0x90
>> [] __cleanup_mnt+0x12/0x20
>> [] task_work_run+0x81/0xa0
>> [] exit_to_usermode_loop+0xb0/0xc0
>> [] syscall_return_slowpath+0xd4/0x130
>> [] int_ret_from_sys_call+0x25/0x8f
>> ---[ end trace cee6ace13018e13e ]---
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791
>> btrfs_free_block_groups+0x365/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 fff

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-06 Thread Stefan Priebe - Profihost AG
Thanks Wang,

I applied them both on top of vanilla v4.8 - I hope this is OK. I will
report back what happens.

Greets,
Stefan

On 06.10.2016 at 05:04, Wang Xiaoguang wrote:
> Hi,
> 
> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that as i do not have anough space to test this
>>>> without
>>>> compression ;-( But yes i've compression enabled.
>>> I might not get you, my poor english :)
>>> You mean that you only get ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get ENOSPC error?
>> I can't tell you. I cannot test with compression not enabled. I do not
>> have anough free space on this disk.
> I had just sent two patches to fix false enospc error for compression,
> please have a try, they fix false enospc error in my test environment.
> btrfs: fix false enospc for compression
> btrfs: improve inode's outstanding_extents computation
> 
> I apply these two patchs in linux upstream tree, the latest commit
> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
> 
> Regards,
> Xiaoguang Wang
> 
>>
>>>>> I'm trying to fix it.
>>>> That sounds good but do you also get the
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> kernel messages on umount? if not you might have found another problem.
>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowpath_common+0x86/0xc0
>> [] warn_slowpath_null+0x1a/0x20
>> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
>> [] close_ctree+0x15d/0x330 [btrfs]
>> [] btrfs_put_super+0x19/0x20 [btrfs]
>> [] generic_shutdown_super+0x6f/0x100
>> [] kill_anon_super+0x12/0x20
>> [] btrfs_kill_super+0x16/0xa0 [btrfs]
>> [] deactivate_locked_super+0x43/0x70
>> [] deactivate_super+0x5c/0x60
>> [] cleanup_mnt+0x3f/0x90
>> [] __cleanup_mnt+0x12/0x20
>> [] task_work_run+0x81/0xa0
>> [] exit_to_usermode_loop+0xb0/0xc0
>> [] syscall_return_slowpath+0xd4/0x130
>> [] int_ret_from_sys_call+0x25/0x8f
>> ---[ end trace cee6ace13018e13e ]---
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791
>> btrfs_free_block_groups+0x365/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowpath_common+0x86/0xc0
>>

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-10-06 Thread Stefan Priebe - Profihost AG
Thanks Wang,

I applied them both on top of vanilla v4.8 - I hope this is OK. Will
report back what happens.

Greets,
Stefan

Am 06.10.2016 um 05:04 schrieb Wang Xiaoguang:
> Hi,
> 
> On 09/29/2016 03:27 PM, Stefan Priebe - Profihost AG wrote:
>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that as i do not have anough space to test this
>>>> without
>>>> compression ;-( But yes i've compression enabled.
>>> I might not get you, my poor english :)
>>> You mean that you only get ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get ENOSPC error?
>> I can't tell you. I cannot test with compression not enabled. I do not
>> have anough free space on this disk.
> I had just sent two patches to fix false enospc error for compression,
> please have a try, they fix false enospc error in my test environment.
> btrfs: fix false enospc for compression
> btrfs: improve inode's outstanding_extents computation
> 
> I apply these two patchs in linux upstream tree, the latest commit
> is 41844e36206be90cd4d962ea49b0abc3612a99d0.
> 
> Regards,
> Xiaoguang Wang
> 
>>
>>>>> I'm trying to fix it.
>>>> That sounds good but do you also get the
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> kernel messages on umount? if not you might have found another problem.
>>> Yes, I seem similar messages, you can paste you whole dmesg info here.
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
>> btrfs_free_block_groups+0x346/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowpath_common+0x86/0xc0
>> [] warn_slowpath_null+0x1a/0x20
>> [] btrfs_free_block_groups+0x346/0x430 [btrfs]
>> [] close_ctree+0x15d/0x330 [btrfs]
>> [] btrfs_put_super+0x19/0x20 [btrfs]
>> [] generic_shutdown_super+0x6f/0x100
>> [] kill_anon_super+0x12/0x20
>> [] btrfs_kill_super+0x16/0xa0 [btrfs]
>> [] deactivate_locked_super+0x43/0x70
>> [] deactivate_super+0x5c/0x60
>> [] cleanup_mnt+0x3f/0x90
>> [] __cleanup_mnt+0x12/0x20
>> [] task_work_run+0x81/0xa0
>> [] exit_to_usermode_loop+0xb0/0xc0
>> [] syscall_return_slowpath+0xd4/0x130
>> [] int_ret_from_sys_call+0x25/0x8f
>> ---[ end trace cee6ace13018e13e ]---
>> [ cut here ]
>> WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791
>> btrfs_free_block_groups+0x365/0x430 [btrfs]()
>> Modules linked in: netconsole xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
>> irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
>> ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
>> btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
>> i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
>> sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
>> CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1
>> Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
>>  880fda777d00 813b69c3 
>> c067a099 880fda777d38 810821c6 
>> 880074bf0a00 88103c10c088 88103c10c000 88103c10c098
>> Call Trace:
>> [] dump_stack+0x63/0x90
>> [] warn_slowpath_common+0x86/0xc0
>>

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-29 Thread Stefan Priebe - Profihost AG
Hi,

Am 29.09.2016 um 12:03 schrieb Adam Borowski:
> On Thu, Sep 29, 2016 at 09:27:01AM +0200, Stefan Priebe - Profihost AG wrote:
>> Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>>>> currently
>>>> I cannot confirm that as i do not have anough space to test this without
>>>> compression ;-( But yes i've compression enabled.
>>> I might not get you, my poor english :)
>>> You mean that you only get ENOSPC error when compression is enabled?
>>>
>>> And when compression is not enabled, you do not get ENOSPC error?
>>
>> I can't tell you. I cannot test with compression not enabled. I do not
>> have anough free space on this disk.
> 
> Disabling compression doesn't immediately require any space -- it affects
> only newly written data.  What you already have remains in the old
> compression setting, unless you defrag everything (a side effect of
> defragging is switching existing extents to the new compression mode).

Yes, I know that, but most of the workload is creating reflinks to old
files and modifying data in them. So to create a good test I need to
defrag and uncompress all those files.

Greets,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-29 Thread Stefan Priebe - Profihost AG
Am 29.09.2016 um 09:13 schrieb Wang Xiaoguang:
>>> I found that compress sometime report ENOSPC error even in 4.8-rc8,
>>> currently
>> I cannot confirm that as i do not have anough space to test this without
>> compression ;-( But yes i've compression enabled.
> I might not get you, my poor english :)
> You mean that you only get ENOSPC error when compression is enabled?
> 
> And when compression is not enabled, you do not get ENOSPC error?

I can't tell you. I cannot test with compression not enabled. I do not
have enough free space on this disk.

>>> I'm trying to fix it.
>> That sounds good but do you also get the
>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>
>> kernel messages on umount? if not you might have found another problem.
> Yes, I see similar messages; you can paste your whole dmesg info here.

[ cut here ]
WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5790
btrfs_free_block_groups+0x346/0x430 [btrfs]()
Modules linked in: netconsole xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
CPU: 2 PID: 5187 Comm: umount Tainted: G O 4.4.22+63-ph #1
Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
 880fda777d00 813b69c3 
c067a099 880fda777d38 810821c6 
880074bf0a00 88103c10c088 88103c10c000 88103c10c098
Call Trace:
[] dump_stack+0x63/0x90
[] warn_slowpath_common+0x86/0xc0
[] warn_slowpath_null+0x1a/0x20
[] btrfs_free_block_groups+0x346/0x430 [btrfs]
[] close_ctree+0x15d/0x330 [btrfs]
[] btrfs_put_super+0x19/0x20 [btrfs]
[] generic_shutdown_super+0x6f/0x100
[] kill_anon_super+0x12/0x20
[] btrfs_kill_super+0x16/0xa0 [btrfs]
[] deactivate_locked_super+0x43/0x70
[] deactivate_super+0x5c/0x60
[] cleanup_mnt+0x3f/0x90
[] __cleanup_mnt+0x12/0x20
[] task_work_run+0x81/0xa0
[] exit_to_usermode_loop+0xb0/0xc0
[] syscall_return_slowpath+0xd4/0x130
[] int_ret_from_sys_call+0x25/0x8f
---[ end trace cee6ace13018e13e ]---
[ cut here ]
WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:5791
btrfs_free_block_groups+0x365/0x430 [btrfs]()
Modules linked in: netconsole xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1
Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
 880fda777d00 813b69c3 
c067a099 880fda777d38 810821c6 
880074bf0a00 88103c10c088 88103c10c000 88103c10c098
Call Trace:
[] dump_stack+0x63/0x90
[] warn_slowpath_common+0x86/0xc0
[] warn_slowpath_null+0x1a/0x20
[] btrfs_free_block_groups+0x365/0x430 [btrfs]
[] close_ctree+0x15d/0x330 [btrfs]
[] btrfs_put_super+0x19/0x20 [btrfs]
[] generic_shutdown_super+0x6f/0x100
[] kill_anon_super+0x12/0x20
[] btrfs_kill_super+0x16/0xa0 [btrfs]
[] deactivate_locked_super+0x43/0x70
[] deactivate_super+0x5c/0x60
[] cleanup_mnt+0x3f/0x90
[] __cleanup_mnt+0x12/0x20
[] task_work_run+0x81/0xa0
[] exit_to_usermode_loop+0xb0/0xc0
[] syscall_return_slowpath+0xd4/0x130
[] int_ret_from_sys_call+0x25/0x8f
---[ end trace cee6ace13018e13f ]---
[ cut here ]
WARNING: CPU: 2 PID: 5187 at fs/btrfs/extent-tree.c:10151
btrfs_free_block_groups+0x291/0x430 [btrfs]()
Modules linked in: netconsole xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding x86_pkg_temp_thermal coretemp kvm_intel kvm
irqbypass sb_edac crc32_pclmul edac_core i2c_i801 i40e(O) vxlan
ip6_udp_tunnel udp_tunnel shpchp ipmi_si ipmi_msghandler button loop
btrfs dm_mod raid10 raid0 multipath linear raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq igb i2c_algo_bit
i2c_core usbhid raid1 md_mod xhci_pci sg ehci_pci xhci_hcd ehci_hcd
sd_mod ahci usbcore ptp libahci usb_common pps_core aacraid
CPU: 2 PID: 5187 Comm: umount Tainted: G W O 4.4.22+63-ph #1
Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
 880fda777d00 813b69c3 

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-29 Thread Stefan Priebe - Profihost AG

Am 29.09.2016 um 08:55 schrieb Wang Xiaoguang:
> Hi,
> 
> On 09/29/2016 02:49 PM, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> Am 28.09.2016 um 14:10 schrieb Wang Xiaoguang:
>>> OK, I see.
>>> But given that you often run into enospc errors, can you work out a
>>> reproduce
>>> script according to you work load. That will give us great help.
> You got ENOSPC errors only when you have compress enabled?
> 
> I found that compress sometime report ENOSPC error even in 4.8-rc8,
> currently

I cannot confirm that, as I do not have enough space to test this without
compression ;-( But yes, I have compression enabled.

> I'm trying to fix it.

That sounds good but do you also get the
BTRFS: space_info 4 has 18446742286429913088 free, is not full

kernel messages on umount? if not you might have found another problem.

Stefan

> 
> Regards,
> Xiaoguang Wang
> 
>> I tried hard to reproduce it but i can't get it to reproduce with a test
>> script. Any ideas?
>>
>> Stefan
>>
>>> Regards,
>>> Xiaoguang Wang
>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>>> Regards,
>>>>> Xiaoguang Wang
>>>>>> Greets,
>>>>>> Stefan
>>>>>> -- 
>>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>>> linux-btrfs" in
>>>>>> the body of a message to majord...@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>
>>>>>>
>>>>>
>>>
>>>
>>
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-29 Thread Stefan Priebe - Profihost AG
Hi,

Am 28.09.2016 um 14:10 schrieb Wang Xiaoguang:
> OK, I see.
> But given that you often run into enospc errors, can you work out a
> reproduce
> script according to you work load. That will give us great help.

I tried hard to reproduce it, but I can't get it to reproduce with a test
script. Any ideas?

Stefan

> 
> Regards,
> Xiaoguang Wang
> 
>>
>> Greets,
>> Stefan
>>
>>> Regards,
>>> Xiaoguang Wang
 Greets,
 Stefan
 -- 
 To unsubscribe from this list: send the line "unsubscribe
 linux-btrfs" in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


>>>
>>>
>>
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-28 Thread Stefan Priebe - Profihost AG
Am 28.09.2016 um 15:44 schrieb Holger Hoffstätte:

>> Good idea but it does not. I hope i can reproduce this with my already
>> existing testscript which i've now bumped to use a 37TB partition and
>> big files rather than a 15GB part and small files. If i can reproduce it
>> i can also check whether disabling compression fixes this.
> 
> Great. Remember to undo the compression on existing files, or create
> them from scratch.

I create files from scratch, but currently I can't trigger the problem
with my test script. Even under production load it's not that easy: I
need to process 60-120 files before the error is triggered.

>> No that's not the case. No rsync nor inplace is involved. I'm dumping
>> differences directly from ceph and put them on top of a base image but
>> only for 7 days. So it's not endless fragmenting the file. After 7 days
>> a clean whole image is dumped.
> 
> That sounds sane but it's also not at all how you described things to me
> previosuly ;) But OK.
I'm sorry. Maybe my English is just bad, you got me wrong, or I was drunk
*joke*. It never changed.

> How do you "dump differences directly from
> Ceph"? I'd assume the VM images are RBDs, but it sounds you're somehow
> using overlayfs.

You can use rbd diff to export differences between two snapshots. So no
overlayfs involved.

> Anyway..something is off and you successfully cause it while other
> people apparently do not.
Sure, I know that. But I still don't want to switch to ZFS.

> Do you still use those nonstandard mount
> options with extremely long transaction flush times?
No, I removed commit=300 just to be sure it does not cause this issue.

Sure,
Stefan


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-28 Thread Stefan Priebe - Profihost AG
Dear Holger,

first thanks for your long e-mail.

Am 28.09.2016 um 14:47 schrieb Holger Hoffstätte:
> On 09/28/16 13:35, Wang Xiaoguang wrote:
>> hello,
>>
>> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear list,
>>>
>>> is there any chance anybody wants to work with me on the following issue?
>> Though I'm also somewhat new to btrfs, but I'd like to.
>>
>>>
>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>>> reserved=0, may_use=1808490201088, readonly=0
>>>
>>> i get this nearly every day.
>>>
>>> Here are some msg collected from today and yesterday from different servers:
>>> | BTRFS: space_info 4 has 18446742182612910080 free, is not full |
>>> | BTRFS: space_info 4 has 18446742254739439616 free, is not full |
>>> | BTRFS: space_info 4 has 18446743980225085440 free, is not full |
>>> | BTRFS: space_info 4 has 18446743619906420736 free, is not full |
>>> | BTRFS: space_info 4 has 18446743647369576448 free, is not full |
>>> | BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>
>>> What i tried so far without success:
>>> - use vanilla 4.8-rc8 kernel
>>> - use latest vanilla 4.4 kernel
>>> - use latest 4.4 kernel + patches from holger hoffstaette
> 
> Was that 4.4.22? It contains a patch by Goldwyn Rodrigues called
> "Prevent qgroup->reserved from going subzero" which should prevent
> this from happening. This should only affect filesystems with enabled
> quota; you said you didn't have quota enabled, yet some quota-only
> patches caused problems on your system (despite being scheduled for
> 4.9 and apparently working fine everywhere else, even when I
> specifically tested them *with* quota enabled).

Yes, this is 4.4.22, and no, I don't have qgroups enabled, so it can't help.

# btrfs qgroup show /path/
ERROR: can't perform the search - No such file or directory
ERROR: can't list qgroups: No such file or directory

This is the same output on all backup machines.

> It means either:
> - you tried my patchset for 4.4.21 (i.e. *without* the above patch)
>   and should bump to .22 right away

No it's 4.4.22

> - you _do_ have qgroups enabled for some reason (systemd?)

No, see above - but yes, I use systemd.

> - your fs is corrupted and needs nuking

If this is the case, all filesystems on 5 servers must be corrupted, and
all of them were installed at a different date / year. The newest one
just 5 months ago with kernel 4.1, the others with 3.18. Also a lot of
other systems with just 100-900GB of space are working fine.

> - you did something else entirely
No idea what this could be.

> There is also the chance that your use of compress-force (or rather
> compression in general) causes leakage; compression runs asynchronously
> and I wouldn't be surprised if that is still full of racy races..which
> would be unfortunate, but you could try to disable compression for a
> while and see what  happens, assuming the space requirements allow this
> experiment.
Good idea, but it does not. I hope I can reproduce this with my already
existing test script, which I've now bumped to use a 37TB partition and
big files rather than a 15GB partition and small files. If I can
reproduce it, I can also check whether disabling compression fixes this.

What speaks against this is that I also have a MariaDB server which has
been running fine for two years with compress-force, but it uses only
< 100GB files and does not create and remove them on a daily basis.
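
A scaled-down sketch of such a test script (file names, sizes, and the
7-day rotation are illustrative stand-ins for the real multi-TB backup
workload; the FICLONE ioctl gives a btrfs reflink and falls back to a
plain copy on filesystems that don't support it):

```python
import fcntl, os, tempfile

FICLONE = 0x40049409  # Linux ioctl: clone (reflink) a whole file

def clone_or_copy(src, dst):
    # Reflink on btrfs; plain copy on filesystems without FICLONE.
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
        except OSError:
            d.write(s.read())

base = tempfile.mkdtemp()
image = os.path.join(base, "base.img")
with open(image, "wb") as f:
    f.write(os.urandom(1 << 20))  # 1 MiB stand-in for a multi-TB image

for day in range(7):  # seven daily differential copies on top of the base
    snap = os.path.join(base, "day%d.img" % day)
    clone_or_copy(image, snap)
    with open(snap, "r+b") as f:  # modify a few blocks "in place"
        f.seek(day * 4096)
        f.write(os.urandom(4096))

os.replace(os.path.join(base, "day6.img"), image)  # fresh full image
```

Scaled up (and pointed at an actual btrfs mount), a loop like this
exercises the same reflink-then-overwrite pattern as the backups.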

> You have also not told us whether this happens only on one (potentially
> corrupted/confused) fs or on every one - my impression was that you have
> several sharded backup filesystems/machines; not sure if that is still
> the case. If it happens only on one specific fs chances are it's hosed.

It happens on all of them - sorry if I missed mentioning this.

>> I also met enospc error in 4.8-rc6 when doing big files create and delete 
>> tests,
>> for my cases, I have written some patches to fix it.
>> Would you please apply my patches to have a try:
>> btrfs: try to satisfy metadata requests when every flush_space() returns
>> btrfs: try to write enough delalloc bytes when reclaiming metadata space
>> btrfs: make shrink_delalloc() try harder to reclaim metadata space
> 
> These are all in my series for 4.4.22 and seem to work fine, however
> Stefan's workload has nothing directly to do with big files; instead
> it's the worst case scenario in terms of fragmentation (of huge files) and
> a huge number of extents: incremental backups of VMs via rsync --inplace 
> with forced compression.

No, that's not the case. No rsync nor inplace is involved.

Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-28 Thread Stefan Priebe - Profihost AG
Am 28.09.2016 um 14:10 schrieb Wang Xiaoguang:
> hello,
> 
> On 09/28/2016 08:02 PM, Stefan Priebe - Profihost AG wrote:
>> Hi Xiaoguang Wang,
>>
>> Am 28.09.2016 um 13:35 schrieb Wang Xiaoguang:
>>> hello,
>>>
>>> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>>>> Dear list,
>>>>
>>>> is there any chance anybody wants to work with me on the following
>>>> issue?
>>> Though I'm also somewhat new to btrfs, but I'd like to.
>>>
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>>>> reserved=0, may_use=1808490201088, readonly=0
>>>>
>>>> i get this nearly every day.
>>>>
>>>> Here are some msg collected from today and yesterday from different
>>>> servers:
>>>> | BTRFS: space_info 4 has 18446742182612910080 free, is not full |
>>>> | BTRFS: space_info 4 has 18446742254739439616 free, is not full |
>>>> | BTRFS: space_info 4 has 18446743980225085440 free, is not full |
>>>> | BTRFS: space_info 4 has 18446743619906420736 free, is not full |
>>>> | BTRFS: space_info 4 has 18446743647369576448 free, is not full |
>>>> | BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> What i tried so far without success:
>>>> - use vanilla 4.8-rc8 kernel
>>>> - use latest vanilla 4.4 kernel
>>>> - use latest 4.4 kernel + patches from holger hoffstaette
>>>> - use clear_cache,space_cache=v2
>>>> - use clear_cache,space_cache=v1
>>>>
>>>> But all tries result in ENOSPC after a short period of time doing
>>>> backups.
>>> I also met enospc error in 4.8-rc6 when doing big files create and
>>> delete tests,
>>> for my cases, I have written some patches to fix it.
>>> Would you please apply my patches to have a try:
>>> btrfs: try to satisfy metadata requests when every flush_space() returns
>>> btrfs: try to write enough delalloc bytes when reclaiming metadata space
>>> btrfs: make shrink_delalloc() try harder to reclaim metadata space
>>> You can find them in btrfs mail list.
>> those are already in the patchset from holger:
>>
>> So i have these in my testing patchset (latest 4.4 kernel + patches from
>> holger hoffstaette):
>>
>> btrfs-20160921-try-to-satisfy-metadata-requests-when-every-flush_space()-returns.patch
>>
>>
>> btrfs-20160921-try-to-write-enough-delalloc-bytes-when-reclaiming-metadata-space.patch
>>
>>
>> btrfs-20160922-make-shrink_delalloc()-try-harder-to-reclaim-metadata-space.patch
>>
> OK, I see.
> But given that you often run into enospc errors, can you work out a
> reproduce
> script according to you work load. That will give us great help.

I already tried that, but it wasn't working. It seems I need a test
device with 20TB+ and I need to create files that big in the tests. But
that isn't easy. Currently I have no test hardware that big. Maybe I
should try that on a production server.

Stefan

> Regards,
> Xiaoguang Wang
> 
>>
>> Greets,
>> Stefan
>>
>>> Regards,
>>> Xiaoguang Wang
>>>> Greets,
>>>> Stefan
>>>> -- 
>>>> To unsubscribe from this list: send the line "unsubscribe
>>>> linux-btrfs" in
>>>> the body of a message to majord...@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>>
>>>
>>>
>>
> 
> 
> 


Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-28 Thread Stefan Priebe - Profihost AG
Hi Xiaoguang Wang,

Am 28.09.2016 um 13:35 schrieb Wang Xiaoguang:
> hello,
> 
> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>> Dear list,
>>
>> is there any chance anybody wants to work with me on the following issue?
> Though I'm also somewhat new to btrfs, but I'd like to.
> 
>>
>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>> reserved=0, may_use=1808490201088, readonly=0
>>
>> i get this nearly every day.
>>
>> Here are some msg collected from today and yesterday from different
>> servers:
>> | BTRFS: space_info 4 has 18446742182612910080 free, is not full |
>> | BTRFS: space_info 4 has 18446742254739439616 free, is not full |
>> | BTRFS: space_info 4 has 18446743980225085440 free, is not full |
>> | BTRFS: space_info 4 has 18446743619906420736 free, is not full |
>> | BTRFS: space_info 4 has 18446743647369576448 free, is not full |
>> | BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>
>> What i tried so far without success:
>> - use vanilla 4.8-rc8 kernel
>> - use latest vanilla 4.4 kernel
>> - use latest 4.4 kernel + patches from holger hoffstaette
>> - use clear_cache,space_cache=v2
>> - use clear_cache,space_cache=v1
>>
>> But all tries result in ENOSPC after a short period of time doing
>> backups.
> I also met enospc error in 4.8-rc6 when doing big files create and
> delete tests,
> for my cases, I have written some patches to fix it.
> Would you please apply my patches to have a try:
> btrfs: try to satisfy metadata requests when every flush_space() returns
> btrfs: try to write enough delalloc bytes when reclaiming metadata space
> btrfs: make shrink_delalloc() try harder to reclaim metadata space
> You can find them in btrfs mail list.

Those are already in the patchset from Holger.

So I have these in my testing patchset (latest 4.4 kernel + patches from
Holger Hoffstaette):

btrfs-20160921-try-to-satisfy-metadata-requests-when-every-flush_space()-returns.patch

btrfs-20160921-try-to-write-enough-delalloc-bytes-when-reclaiming-metadata-space.patch

btrfs-20160922-make-shrink_delalloc()-try-harder-to-reclaim-metadata-space.patch

Greets,
Stefan

> 
> Regards,
> Xiaoguang Wang
>>
>> Greets,
>> Stefan
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
> 
> 
> 


BTRFS: space_info 4 has 18446742286429913088 free, is not full

2016-09-28 Thread Stefan Priebe - Profihost AG
Dear list,

is there any chance anybody wants to work with me on the following issue?

BTRFS: space_info 4 has 18446742286429913088 free, is not full
BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
reserved=0, may_use=1808490201088, readonly=0

I get this nearly every day.

Here are some msg collected from today and yesterday from different servers:
| BTRFS: space_info 4 has 18446742182612910080 free, is not full |
| BTRFS: space_info 4 has 18446742254739439616 free, is not full |
| BTRFS: space_info 4 has 18446743980225085440 free, is not full |
| BTRFS: space_info 4 has 18446743619906420736 free, is not full |
| BTRFS: space_info 4 has 18446743647369576448 free, is not full |
| BTRFS: space_info 4 has 18446742286429913088 free, is not full
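
Incidentally, those huge "free" values are small negative numbers printed
as unsigned 64-bit integers. Assuming "free" here is computed as
total - used - pinned - reserved - may_use - readonly (the fields of the
second log line above), the arithmetic checks out exactly:

```python
# Values from the "space_info total=..." line above.
total, used, pinned, reserved, may_use, readonly = (
    98247376896, 77036814336, 0, 0, 1808490201088, 0)

free = total - used - pinned - reserved - may_use - readonly
print(free)            # negative: may_use has grown past total
print(free % 2**64)    # wraps to 18446742286429913088, the logged value
```

So the ~1.6TiB of leaked may_use reservation is what makes "free" go
negative and the allocator return ENOSPC.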

What i tried so far without success:
- use vanilla 4.8-rc8 kernel
- use latest vanilla 4.4 kernel
- use latest 4.4 kernel + patches from holger hoffstaette
- use clear_cache,space_cache=v2
- use clear_cache,space_cache=v1

But all tries result in ENOSPC after a short period of time doing backups.
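
For reference, one of the attempts above corresponds to a mount line
roughly like the following (device and mountpoint are placeholders;
clear_cache is a one-shot option that forces the free-space cache to be
rebuilt on the next mount):

```shell
# /etc/fstab sketch -- /dev/sdX and /backup are placeholders
/dev/sdX  /backup  btrfs  compress-force=zlib,clear_cache,space_cache=v2  0  0
```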

Greets,
Stefan


Re: deadlock with btrfs heavy i/o and kswapd

2016-09-27 Thread Stefan Priebe - Profihost AG
Hi Chris,

Today I had this again, but I can't see any stack traces. I just see:

INFO: kworker/u128:5:24301 blocked for more than 120 seconds.
...
INFO: kworker/u128:5:24301 blocked for more than 120 seconds.
...
INFO: task mysqld:929 blocked for more ...
...

sysrq w just prints:
sysrq: SysRq: Show Blocked State

but nothing more.

Stefan
Am 22.09.2016 um 16:28 schrieb Chris Mason:
> 
> 
> On 09/22/2016 02:41 AM, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> i always encounter btrfs deadlocks / hung tasks, when i have a lot of
>> cached mem and i'm doing heavy rsync --inplace operations in my system
>> from btrfs zlib compressed disk A to btrfs zlib compressed disk B.
>>
>> The last output i see in this case is kswapd0 running for a long time at
>> 100% cpu. Then the whole system get's stuck. I cannot connect to ssh
>> anymore but the kernel still prints hung tasks every few minutes.
>>
>> May be relevant the system has NO swap.
>>
>> vm.vfs_cache_pressure = 100
>> vm.swappiness = 50
> 
> Are you able to capture the stack dumps?  A sysrq-w would really help.
> 
> -chris
> 


ENOSPACE linux 4.8-rc6 BTRFS: space_info 4 has 18446743524878843904 free, is not full

2016-09-13 Thread Stefan Priebe - Profihost AG
Hi,

This is vanilla Linux 4.8-rc6, and I still have ENOSPC issues with btrfs
- caused by wrong space_tree entries.

[ 9736.921995] [ cut here ]
[ 9736.923342] WARNING: CPU: 1 PID: 23942 at fs/btrfs/extent-tree.c:5734
btrfs_free_block_groups+0x35e/0x440 [btrfs]
[ 9736.926229] Modules linked in: netconsole xt_multiport iptable_filter
ip_tables x_tables 8021q garp bonding sb_edac edac_core
x86_pkg_temp_thermal coretemp kvm_intel kvm ipmi_si irqbypass i2c_i801
crc32_pclmul i2c_smbus shpchp ghash_clmulni_intel ipmi_msghandler button
loop btrfs dm_mod raid10 raid0 multipath linear raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
raid1 md_mod sg sd_mod usbhid xhci_pci igb ehci_pci i2c_algo_bit
xhci_hcd ehci_hcd i40e i2c_core ahci usbcore ptp usb_common libahci
aacraid pps_core
[ 9736.941228] CPU: 1 PID: 23942 Comm: umount Not tainted 4.8.0-rc6 #6
[ 9736.943497] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0
12/17/2015
[ 9736.945561]   8a571d3d3cf8 a33de0b3

[ 9736.947720]   8a571d3d3d38 a3084e01
1666a30e2499
[ 9736.949880]   8a577b216088 8a577b216000
8a577aa17200
[ 9736.952043] Call Trace:
[ 9736.952711]  [] dump_stack+0x63/0x90
[ 9736.954139]  [] __warn+0xd1/0xf0
[ 9736.955466]  [] warn_slowpath_null+0x1d/0x20
[ 9737.022831]  [] btrfs_free_block_groups+0x35e/0x440
[btrfs]
[ 9737.091125]  [] close_ctree+0x15d/0x340 [btrfs]
[ 9737.159547]  [] btrfs_put_super+0x19/0x20 [btrfs]
[ 9737.227648]  [] generic_shutdown_super+0x6f/0x100
[ 9737.295227]  [] kill_anon_super+0x12/0x20
[ 9737.362199]  [] btrfs_kill_super+0x16/0xa0 [btrfs]
[ 9737.428716]  [] deactivate_locked_super+0x43/0x70
[ 9737.494608]  [] deactivate_super+0x5c/0x60
[ 9737.559338]  [] cleanup_mnt+0x3f/0x90
[ 9737.623414]  [] __cleanup_mnt+0x12/0x20
[ 9737.687439]  [] task_work_run+0x7e/0xa0
[ 9737.750376]  [] exit_to_usermode_loop+0xb0/0xc0
[ 9737.813436]  [] do_syscall_64+0x189/0x1f0
[ 9737.875948]  [] entry_SYSCALL64_slow_path+0x25/0x25
[ 9737.938449] ---[ end trace 767418320c59f391 ]---
[ 9738.000649] [ cut here ]
[ 9738.062721] WARNING: CPU: 1 PID: 23942 at fs/btrfs/extent-tree.c:5735
btrfs_free_block_groups+0x37d/0x440 [btrfs]
[ 9738.128037] Modules linked in: netconsole xt_multiport iptable_filter
ip_tables x_tables 8021q garp bonding sb_edac edac_core
x86_pkg_temp_thermal coretemp kvm_intel kvm ipmi_si irqbypass i2c_i801
crc32_pclmul i2c_smbus shpchp ghash_clmulni_intel ipmi_msghandler button
loop btrfs dm_mod raid10 raid0 multipath linear raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
raid1 md_mod sg sd_mod usbhid xhci_pci igb ehci_pci i2c_algo_bit
xhci_hcd ehci_hcd i40e i2c_core ahci usbcore ptp usb_common libahci
aacraid pps_core
[ 9738.487487] CPU: 1 PID: 23942 Comm: umount Tainted: GW
4.8.0-rc6 #6
[ 9738.564325] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0
12/17/2015
[ 9738.641240]   8a571d3d3cf8 a33de0b3

[ 9738.719075]   8a571d3d3d38 a3084e01
1667a30e2499
[ 9738.796597]   8a577b216088 8a577b216000
8a577aa17200
[ 9738.874120] Call Trace:
[ 9738.950936]  [] dump_stack+0x63/0x90
[ 9739.028733]  [] __warn+0xd1/0xf0
[ 9739.106261]  [] warn_slowpath_null+0x1d/0x20
[ 9739.184672]  [] btrfs_free_block_groups+0x37d/0x440
[btrfs]
[ 9739.263666]  [] close_ctree+0x15d/0x340 [btrfs]
[ 9739.341059]  [] btrfs_put_super+0x19/0x20 [btrfs]
[ 9739.416799]  [] generic_shutdown_super+0x6f/0x100
[ 9739.490619]  [] kill_anon_super+0x12/0x20
[ 9739.562240]  [] btrfs_kill_super+0x16/0xa0 [btrfs]
[ 9739.632301]  [] deactivate_locked_super+0x43/0x70
[ 9739.700692]  [] deactivate_super+0x5c/0x60
[ 9739.767536]  [] cleanup_mnt+0x3f/0x90
[ 9739.833250]  [] __cleanup_mnt+0x12/0x20
[ 9739.898076]  [] task_work_run+0x7e/0xa0
[ 9739.962977]  [] exit_to_usermode_loop+0xb0/0xc0
[ 9740.028080]  [] do_syscall_64+0x189/0x1f0
[ 9740.092352]  [] entry_SYSCALL64_slow_path+0x25/0x25
[ 9740.156898] ---[ end trace 767418320c59f392 ]---
[ 9740.221712] [ cut here ]
[ 9740.286589] WARNING: CPU: 1 PID: 23942 at
fs/btrfs/extent-tree.c:10062 btrfs_free_block_groups+0x2a9/0x440 [btrfs]
[ 9740.354299] Modules linked in: netconsole xt_multiport iptable_filter
ip_tables x_tables 8021q garp bonding sb_edac edac_core
x86_pkg_temp_thermal coretemp kvm_intel kvm ipmi_si irqbypass i2c_i801
crc32_pclmul i2c_smbus shpchp ghash_clmulni_intel ipmi_msghandler button
loop btrfs dm_mod raid10 raid0 multipath linear raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
raid1 md_mod sg sd_mod usbhid xhci_pci igb ehci_pci i2c_algo_bit
xhci_hcd ehci_hcd i40e i2c_core ahci usbcore ptp usb_common libahci
aacraid pps_core
[ 9740.720097] CPU: 1 PID: 23942 Comm: umount Tainted: GW
4.8.0-rc6 #6
[ 9740.797412] 

btrfs_endio_write_helper hard lock blocked workqueues with kernel 4.8-rc5

2016-09-10 Thread Stefan Priebe - Profihost AG
Hi,

today i've seen this one with 4.8-rc5 and the system became
unresponsive.

BUG: workqueue lockup - pool cpus=14 node=1 flags=0x0 nice=0 stuck for 33s!
BUG: workqueue lockup - pool cpus=14 node=1 flags=0x0 nice=-20 stuck for
33s!
Showing busy workqueues and worker pools:
workqueue kblockd: flags=0x18
pwq 29: cpus=14 node=1 flags=0x0 nice=-20 active=9/256
pending: cfq_kick_queue, cfq_kick_queue, cfq_kick_queue, cfq_kick_queue,
cfq_kick_queue, cfq_kick_queue, cfq_kick_queue, cfq_kick_queue,
cfq_kick_queue
workqueue vmstat: flags=0xc
pwq 28: cpus=14 node=1 flags=0x0 nice=0 active=1/256
pending: vmstat_update
workqueue btrfs-endio-write: flags=0xe
pwq 66: cpus=8-15,24-31 node=1 flags=0x4 nice=0 active=8/8
in-flight: 12942:btrfs_endio_write_helper [btrfs],
11348:btrfs_endio_write_helper [btrfs], 11350:btrfs_endio_write_helper
[btrfs], 5472:btrfs_endio_write_helper [btrfs],
3277:btrfs_endio_write_helper [btrfs], 13523:btrfs_endio_write_helper
[btrfs], 5477:btrfs_endio_write_helper [btrfs],
5471:btrfs_endio_write_helper [btrfs]
delayed: btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper
[btrfs], btrfs_endio_write_helper [btrfs], btrfs_endio_write_helper

Re: [PATCH v3] btrfs: should block unused block groups deletion work when allocating data space

2016-09-10 Thread Stefan Priebe - Profihost AG
Thanks,

this one works fine. No deadlocks.

Stefan

Am 09.09.2016 um 10:17 schrieb Wang Xiaoguang:
> cleaner_kthread() may run at any time, in which it'll call 
> btrfs_delete_unused_bgs()
> to delete unused block groups. Because this work is asynchronous, it may also 
> result
> in false ENOSPC error. Please see below race window:
> 
>CPU1   | CPU2
>   |
> |-> btrfs_alloc_data_chunk_ondemand() |-> cleaner_kthread()
> |-> do_chunk_alloc()  |   |
> |   assume it returns ENOSPC, which means |   |
> |   btrfs_space_info is full and have free|   |
> |   space to satisfy data request.|   |
> | |   |- > 
> btrfs_delete_unused_bgs()
> | |   |it will decrease 
> btrfs_space_info
> | |   |total_bytes and make
> | |   |btrfs_space_info is 
> not full.
> | |   |
> In this case, we may get ENOSPC error, but btrfs_space_info is not full.
> 
> To fix this issue, in btrfs_alloc_data_chunk_ondemand(), if we need to call
> do_chunk_alloc() to allocating new chunk, we should block 
> btrfs_delete_unused_bgs().
> Here we introduce a new struct rw_semaphore bg_delete_sem to do this job.
> 
> Indeed there is already a "struct mutex delete_unused_bgs_mutex", but it's 
> mutex,
> we can not use it for this purpose. Of course, we can re-define it to be 
> struct
> rw_semaphore, then use it in btrfs_alloc_data_chunk_ondemand(). Either method 
> will
> work.
> 
> But given that delete_unused_bgs_mutex's name length is longer than 
> bg_delete_sem,
> I choose the first method, to create a new struct rw_semaphore bg_delete_sem 
> and
> delete delete_unused_bgs_mutex :)
> 
> Reported-by: Stefan Priebe <s.pri...@profihost.ag>
> Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com>
> ---
> V2: fix a deadlock revealed by fstests case btrfs/071, we call
> start_transaction() before in down_write(bg_delete_sem) in
> btrfs_delete_unused_bgs().
> 
> v3: Stefan Priebe reported another similar deadlock, so here we choose
> to not call down_read(bg_delete_sem) for free space inode in
> btrfs_alloc_data_chunk_ondemand(). Meanwhile because we only do the
> data space reservation for free space cache in the transaction context,
> btrfs_delete_unused_bgs() will either have finished its job, or start
> a new transaction waiting current transaction to complete, there will
> be no unused block groups to be deleted, so it's safe to not call
> down_read(bg_delete_sem)
> ---
> ---
>  fs/btrfs/ctree.h   |  2 +-
>  fs/btrfs/disk-io.c | 13 +--
>  fs/btrfs/extent-tree.c | 59 
> --
>  fs/btrfs/volumes.c | 42 +--
>  4 files changed, 76 insertions(+), 40 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index eff3993..fa78ef9 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -788,6 +788,7 @@ struct btrfs_fs_info {
>   struct mutex cleaner_mutex;
>   struct mutex chunk_mutex;
>   struct mutex volume_mutex;
> + struct rw_semaphore bg_delete_sem;
>  
>   /*
>* this is taken to make sure we don't set block groups ro after
> @@ -1068,7 +1069,6 @@ struct btrfs_fs_info {
>   spinlock_t unused_bgs_lock;
>   struct list_head unused_bgs;
>   struct mutex unused_bg_unpin_mutex;
> - struct mutex delete_unused_bgs_mutex;
>  
>   /* For btrfs to record security options */
>   struct security_mnt_opts security_opts;
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 54bc8c7..3cdbd05 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1868,12 +1868,11 @@ static int cleaner_kthread(void *arg)
>   btrfs_run_defrag_inodes(root->fs_info);
>  
>   /*
> -  * Acquires fs_info->delete_unused_bgs_mutex to avoid racing
> -  * with relocation (btrfs_relocate_chunk) and relocation
> -  * acquires fs_info->cleaner_mutex (btrfs_relocate_block_group)
> -  * after acquiring fs_info->delete_unused_bgs_mutex. So we
> -  * can't hold, nor need to, fs_info->cleaner_mutex when deleting
> -  * unused block groups.
> +  * Acquires fs_info->bg_delete_sem to avoid racing with
> +  * relocation (btrfs_relo

Re: btrfs and systemd

2016-08-30 Thread Stefan Priebe - Profihost AG
Am 29.08.2016 um 13:33 schrieb Timofey Titovets:
> Do you try: nofail,noauto,x-systemd.automount ?

sure this fails too as it has the same timeout in systemd.

Mr. Poettering has recommended that i do the following:
# mkdir -p /etc/systemd/system/$(systemd-escape --suffix=mount -p
/foo/bar/baz).d/
# cat > /etc/systemd/system/$(systemd-escape --suffix=mount -p
/foo/bar/baz).d/timeout.conf <

> 2016-08-29 9:28 GMT+03:00 Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag>:
>> Hi Qu,
>>
>> Am 29.08.2016 um 03:48 schrieb Qu Wenruo:
>>>
>>>
>>> At 08/29/2016 04:15 AM, Stefan Priebe - Profihost AG wrote:
>>>> Hi,
>>>>
>>>> i'm trying to get my 60TB btrfs volume to mount with systemd at boot.
>>>> But this always fails with: "mounting timed out. Stopping." after 90s.
>>>
>>> 60TB is quite large, and under most case it will already cause mount
>>> speed problem.
>>>
>>> In our test environment, filling a fs with 16K small files to 2T (just
>>> 128K files)will already slow the mount process to 10s.
>>>
>>> For larger fs, or more specifically, large extent tree, will slow the
>>> mount process obviously.
>>>
>>> The root fix will need a rework of extent tree.
>>> AFAIK Josef is working on the rework.
>>>
>>> So the btrfs fix will need some time.
>>
>> thanks but i've no problem with the long mount time (in my case 6
>> minutes) i'm just wondering how to live with it with systemd. As it
>> always cancels the mount process after 90s and i see no fstab option to
>> change this.
>>
>> Greets,
>> Stefan
>>
>>>
>>> Thanks,
>>> Qu
>>>>
>>>> I can't find any fstab setting for systemd to higher this timeout.
>>>> There's just  the x-systemd.device-timeout but this controls how long to
>>>> wait for the device and not for the mount command.
>>>>
>>>> Is there any solution for big btrfs volumes and systemd?
>>>>
>>>> Greets,
>>>> Stefan
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majord...@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>>
>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 


Re: memory overflow or undeflow in free space tree / space_info?

2016-08-29 Thread Stefan Priebe - Profihost AG
Hi Josef,

this still happens with current 4.8-rc* releases. Anything i can do to
debug this? Maybe insert some code to check for an under- or overflow in
the code?

Stefan

Am 14.08.2016 um 17:22 schrieb Stefan Priebe - Profihost AG:
> Hi Josef,
> 
> anything i could do or test? Results with a vanilla next branch are the
> same.
> 
> Stefan
> 
> Am 11.08.2016 um 08:09 schrieb Stefan Priebe - Profihost AG:
>> Hello,
>>
>> the backtrace and info on umount looks the same:
>>
>> [241910.341124] [ cut here ]
>> [241910.379991] WARNING: CPU: 1 PID: 26664 at
>> fs/btrfs/extent-tree.c:5701 btrfs_free_block_groups+0x370/0x410 [btrfs]
>> [241910.422099] Modules linked in: netconsole mpt3sas ipt_REJECT
>> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
>> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
>> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
>> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
>> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
>> [241910.616845] CPU: 1 PID: 26664 Comm: umount Not tainted
>> 4.7.0-rc6-29043-g8b8b08c #1
>> [241910.669646] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
>> 02/18/2015
>> [241910.723716]   8808d104bca8 bd3d83cf
>> 
>> [241910.779309]   8808d104bcf8 bd085615
>> 8808d104bd08
>> [241910.835143]  16455a3410a8 0047a000 
>> 8808469e2088
>> [241910.891882] Call Trace:
>> [241910.947624]  [] dump_stack+0x63/0x84
>> [241911.003714]  [] __warn+0xe5/0x100
>> [241911.060167]  [] warn_slowpath_null+0x1d/0x20
>> [241911.117422]  []
>> btrfs_free_block_groups+0x370/0x410 [btrfs]
>> [241911.175975]  [] close_ctree+0x15b/0x330 [btrfs]
>> [241911.235170]  [] btrfs_put_super+0x19/0x20 [btrfs]
>> [241911.294638]  [] generic_shutdown_super+0x6f/0x100
>> [241911.353005]  [] kill_anon_super+0x16/0x30
>> [241911.409832]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
>> [241911.466467]  [] deactivate_locked_super+0x51/0x90
>> [241911.522602]  [] deactivate_super+0x4e/0x70
>> [241911.577979]  [] cleanup_mnt+0x43/0x90
>> [241911.633188]  [] __cleanup_mnt+0x12/0x20
>> [241911.688146]  [] task_work_run+0x81/0xb0
>> [241911.742740]  [] exit_to_usermode_loop+0x66/0x95
>> [241911.797039]  [] do_syscall_64+0x10d/0x150
>> [241911.850750]  [] entry_SYSCALL64_slow_path+0x25/0x25
>> [241911.903564] ---[ end trace fae017546778f2b0 ]---
>> [241911.955332] [ cut here ]
>> [241912.006262] WARNING: CPU: 1 PID: 26664 at
>> fs/btrfs/extent-tree.c:5702 btrfs_free_block_groups+0x40a/0x410 [btrfs]
>> [241912.059326] Modules linked in: netconsole mpt3sas ipt_REJECT
>> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
>> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
>> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
>> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
>> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
>> [241912.298666] CPU: 1 PID: 26664 Comm: umount Tainted: GW
>> 4.7.0-rc6-29043-g8b8b08c #1
>> [241912.363401] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
>> 02/18/2015
>> [241912.429395]   8808d104bca8 bd3d83cf
>> 
>> [241912.497080]   8808d104bcf8 bd085615
>> 8808d104bd08
>> [241912.565113]  16465a3410a8 0047a000 
>> 8808469e2088
>> [241912.634105] Call Trace:
>> [241912.702992]  [] dump_stack+0x63/0x84
>> [241912.773473]  [] __warn+0xe5/0x100
>> [241912.844339]  [] warn_slowpath_null+0x1d/0x20
>> [241912.916083]  []
>> btrfs_free_block_groups+0x40a/0x410 [btrfs]
>> [241912.989103]  [] close_ctree+0x15b/0x330 [btrfs]
>> [241913.062672]  [] btrfs_put_super+0x19/0x20 [btrfs]
>> [241913.136364]  [] generic_shutdown_super+0x6f/0x100
>> [241913.208701]  [] kill_anon_super+0x16/0x30
>> [241913.279194]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
>> [241913.348065]  [] deactivate_locked_super+0x51/0x90
>> [241913.415082]  [] deactivate_super+0x4e/0x70
>> [241913.479841]  [] cleanup_mnt+0x43/0x90
>> [241913.543353]  [] __cleanup_mnt+0x12/0x20
>> [241913.605959]  [] task_work_run+0x81/0xb0
>> [241913.667542]  [] exit_t

Re: btrfs and systemd

2016-08-29 Thread Stefan Priebe - Profihost AG
Hi Qu,

Am 29.08.2016 um 03:48 schrieb Qu Wenruo:
> 
> 
> At 08/29/2016 04:15 AM, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> i'm trying to get my 60TB btrfs volume to mount with systemd at boot.
>> But this always fails with: "mounting timed out. Stopping." after 90s.
> 
> 60TB is quite large, and under most case it will already cause mount
> speed problem.
> 
> In our test environment, filling a fs with 16K small files to 2T (just
> 128K files)will already slow the mount process to 10s.
> 
> For larger fs, or more specifically, large extent tree, will slow the
> mount process obviously.
> 
> The root fix will need a rework of extent tree.
> AFAIK Josef is working on the rework.
> 
> So the btrfs fix will need some time.

thanks, but i've no problem with the long mount time (in my case 6
minutes); i'm just wondering how to live with it under systemd, as it
always cancels the mount process after 90s and i see no fstab option to
change this.

Greets,
Stefan

> 
> Thanks,
> Qu
>>
>> I can't find any fstab setting for systemd to higher this timeout.
>> There's just  the x-systemd.device-timeout but this controls how long to
>> wait for the device and not for the mount command.
>>
>> Is there any solution for big btrfs volumes and systemd?
>>
>> Greets,
>> Stefan
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
> 
> 


btrfs and systemd

2016-08-28 Thread Stefan Priebe - Profihost AG
Hi,

i'm trying to get my 60TB btrfs volume to mount with systemd at boot.
But this always fails with: "mounting timed out. Stopping." after 90s.

I can't find any fstab setting for systemd to raise this timeout.
There's just the x-systemd.device-timeout option, but this controls how
long to wait for the device, not for the mount command.

Is there any solution for big btrfs volumes and systemd?

Greets,
Stefan
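For reference, the per-mount drop-in approach discussed upthread looks like this when written out. A sketch, needing root: the 15-minute value and the /vmbackup path are illustrative, pick whatever comfortably covers the real mount time.

```shell
unit=$(systemd-escape --suffix=mount -p /vmbackup)
mkdir -p /etc/systemd/system/"$unit".d
cat > /etc/systemd/system/"$unit".d/timeout.conf <<'EOF'
[Mount]
TimeoutSec=15min
EOF
systemctl daemon-reload
```

This raises the timeout only for the one generated mount unit instead of changing the global DefaultTimeoutStartSec.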


Re: memory overflow or undeflow in free space tree / space_info?

2016-08-14 Thread Stefan Priebe - Profihost AG
Hi Josef,

anything i could do or test? Results with a vanilla next branch are the
same.

Stefan

Am 11.08.2016 um 08:09 schrieb Stefan Priebe - Profihost AG:
> Hello,
> 
> the backtrace and info on umount looks the same:
> 
> [241910.341124] [ cut here ]
> [241910.379991] WARNING: CPU: 1 PID: 26664 at
> fs/btrfs/extent-tree.c:5701 btrfs_free_block_groups+0x370/0x410 [btrfs]
> [241910.422099] Modules linked in: netconsole mpt3sas ipt_REJECT
> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
> [241910.616845] CPU: 1 PID: 26664 Comm: umount Not tainted
> 4.7.0-rc6-29043-g8b8b08c #1
> [241910.669646] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
> 02/18/2015
> [241910.723716]   8808d104bca8 bd3d83cf
> 
> [241910.779309]   8808d104bcf8 bd085615
> 8808d104bd08
> [241910.835143]  16455a3410a8 0047a000 
> 8808469e2088
> [241910.891882] Call Trace:
> [241910.947624]  [] dump_stack+0x63/0x84
> [241911.003714]  [] __warn+0xe5/0x100
> [241911.060167]  [] warn_slowpath_null+0x1d/0x20
> [241911.117422]  []
> btrfs_free_block_groups+0x370/0x410 [btrfs]
> [241911.175975]  [] close_ctree+0x15b/0x330 [btrfs]
> [241911.235170]  [] btrfs_put_super+0x19/0x20 [btrfs]
> [241911.294638]  [] generic_shutdown_super+0x6f/0x100
> [241911.353005]  [] kill_anon_super+0x16/0x30
> [241911.409832]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
> [241911.466467]  [] deactivate_locked_super+0x51/0x90
> [241911.522602]  [] deactivate_super+0x4e/0x70
> [241911.577979]  [] cleanup_mnt+0x43/0x90
> [241911.633188]  [] __cleanup_mnt+0x12/0x20
> [241911.688146]  [] task_work_run+0x81/0xb0
> [241911.742740]  [] exit_to_usermode_loop+0x66/0x95
> [241911.797039]  [] do_syscall_64+0x10d/0x150
> [241911.850750]  [] entry_SYSCALL64_slow_path+0x25/0x25
> [241911.903564] ---[ end trace fae017546778f2b0 ]---
> [241911.955332] [ cut here ]
> [241912.006262] WARNING: CPU: 1 PID: 26664 at
> fs/btrfs/extent-tree.c:5702 btrfs_free_block_groups+0x40a/0x410 [btrfs]
> [241912.059326] Modules linked in: netconsole mpt3sas ipt_REJECT
> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
> [241912.298666] CPU: 1 PID: 26664 Comm: umount Tainted: GW
> 4.7.0-rc6-29043-g8b8b08c #1
> [241912.363401] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
> 02/18/2015
> [241912.429395]   8808d104bca8 bd3d83cf
> 
> [241912.497080]   8808d104bcf8 bd085615
> 8808d104bd08
> [241912.565113]  16465a3410a8 0047a000 
> 8808469e2088
> [241912.634105] Call Trace:
> [241912.702992]  [] dump_stack+0x63/0x84
> [241912.773473]  [] __warn+0xe5/0x100
> [241912.844339]  [] warn_slowpath_null+0x1d/0x20
> [241912.916083]  []
> btrfs_free_block_groups+0x40a/0x410 [btrfs]
> [241912.989103]  [] close_ctree+0x15b/0x330 [btrfs]
> [241913.062672]  [] btrfs_put_super+0x19/0x20 [btrfs]
> [241913.136364]  [] generic_shutdown_super+0x6f/0x100
> [241913.208701]  [] kill_anon_super+0x16/0x30
> [241913.279194]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
> [241913.348065]  [] deactivate_locked_super+0x51/0x90
> [241913.415082]  [] deactivate_super+0x4e/0x70
> [241913.479841]  [] cleanup_mnt+0x43/0x90
> [241913.543353]  [] __cleanup_mnt+0x12/0x20
> [241913.605959]  [] task_work_run+0x81/0xb0
> [241913.667542]  [] exit_to_usermode_loop+0x66/0x95
> [241913.729612]  [] do_syscall_64+0x10d/0x150
> [241913.791203]  [] entry_SYSCALL64_slow_path+0x25/0x25
> [241913.852485] ---[ end trace fae017546778f2b1 ]---
> [241913.913638] [ cut here ]
> [241913.974871] WARNING: CPU: 1 PID: 26664 at
> fs/btrfs/extent-tree.c:10013 btrfs_free_block_groups+0x2ba/0x410 [btrfs]
> [241914.039315] Modules linked in: netconsole mpt3sas ipt_REJECT
> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
> iptable_filter ip_tables x_tables bonding coretemp loop usbhid e

Re: memory overflow or undeflow in free space tree / space_info?

2016-08-11 Thread Stefan Priebe - Profihost AG
[241914.607523]  271dbd3dac8c 88085184aac8 0038

[241914.681318] Call Trace:
[241914.754437]  [] dump_stack+0x63/0x84
[241914.828796]  [] __warn+0xe5/0x100
[241914.902953]  [] warn_slowpath_null+0x1d/0x20
[241914.977271]  []
btrfs_free_block_groups+0x2ba/0x410 [btrfs]
[241915.052041]  [] close_ctree+0x15b/0x330 [btrfs]
[241915.126282]  [] btrfs_put_super+0x19/0x20 [btrfs]
[241915.200758]  [] generic_shutdown_super+0x6f/0x100
[241915.273872]  [] kill_anon_super+0x16/0x30
[241915.345132]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
[241915.414703]  [] deactivate_locked_super+0x51/0x90
[241915.482488]  [] deactivate_super+0x4e/0x70
[241915.547994]  [] cleanup_mnt+0x43/0x90
[241915.611962]  [] __cleanup_mnt+0x12/0x20
[241915.674717]  [] task_work_run+0x81/0xb0
[241915.736398]  [] exit_to_usermode_loop+0x66/0x95
[241915.798592]  [] do_syscall_64+0x10d/0x150
[241915.860295]  [] entry_SYSCALL64_slow_path+0x25/0x25
[241915.921642] ---[ end trace fae017546778f2b2 ]---
[241915.982893] BTRFS: space_info 4 has 114577997824 free, is not full
[241916.045103] BTRFS: space_info total=307627032576, used=193048903680,
pinned=0, reserved=0, may_use=688537059328, readonly=131072

Greets,
Stefan

Am 10.08.2016 um 23:31 schrieb Stefan Priebe - Profihost AG:
> Hi Josef,
> 
> same again with chris next branch:
> 
> ERROR: error during balancing '/vmbackup/': No space left on device
> There may be more info in syslog - try dmesg | tail
> Dumping filters: flags 0x7, state 0x0, force is off
>   DATA (flags 0x2): balancing, usage=5
>   METADATA (flags 0x2): balancing, usage=5
>   SYSTEM (flags 0x2): balancing, usage=5
> 
> dmesg:
> [203784.411189] BTRFS info (device dm-0): 114 enospc errors during balance
> 
> uname -r 4.7.0-rc6-29043-g8b8b08c
> 
> Greets,
> Stefan
> 
> Am 08.08.2016 um 08:17 schrieb Stefan Priebe - Profihost AG:
>> Am 04.08.2016 um 13:40 schrieb Stefan Priebe - Profihost AG:
>>> Am 29.07.2016 um 23:03 schrieb Josef Bacik:
>>>> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>>>>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>>>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost
>>>>>> AG wrote:
>>>>>>> Dear list,
>>>>>>>
>>>>>>> i'm seeing btrfs no space messages frequently on big filesystems (>
>>>>>>> 30TB).
>>>>>>>
>>>>>>> In all cases i'm getting a trace like this one a space_info warning.
>>>>>>> (since commit [1]). Could someone please be so kind and help me
>>>>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>>>>> systems.
>>>>>>
>>>>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>>>>> rather than the free space tree itself. I haven't debugged one of these
>>>>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>>>>
>>>>> I should've asked, what sort of filesystem activity triggers this?
>>>>>
>>>>
>>>> Chris just fixed this I think, try his next branch from his git tree
>>>>
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>>>
>>> Thanks now running a 4.4 with those patches backported. If that still
>>> shows an error i will try that vanilla tree.
>>
>> OK this didn't work. I'll start / try using the linux-btrfs next branch
>> and look if this helps.
>>
>> Greets,
>> Stefan
>>
>>>
>>> Thanks!
>>>
>>> Stefan
>>>
>>>> and see if it still happens.  Thanks,
>>>>
>>>> Josef


Re: memory overflow or undeflow in free space tree / space_info?

2016-08-10 Thread Stefan Priebe - Profihost AG
Hi Josef,

same again with chris next branch:

ERROR: error during balancing '/vmbackup/': No space left on device
There may be more info in syslog - try dmesg | tail
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=5
  METADATA (flags 0x2): balancing, usage=5
  SYSTEM (flags 0x2): balancing, usage=5

dmesg:
[203784.411189] BTRFS info (device dm-0): 114 enospc errors during balance

uname -r 4.7.0-rc6-29043-g8b8b08c

Greets,
Stefan
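A common workaround when even `usage=5` passes hit ENOSPC is to stage the balance: start with a tiny usage filter and raise it step by step, so each cheap early pass frees room for the later ones. A sketch that only prints the commands (the /vmbackup path and the usage steps are illustrative):

```shell
# Print (not run) a staged balance: each pass relocates only chunks
# below the given usage percentage.
for u in 5 10 25 50; do
    echo btrfs balance start -dusage="$u" -musage="$u" /vmbackup
done
```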

Am 08.08.2016 um 08:17 schrieb Stefan Priebe - Profihost AG:
> Am 04.08.2016 um 13:40 schrieb Stefan Priebe - Profihost AG:
>> Am 29.07.2016 um 23:03 schrieb Josef Bacik:
>>> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>>>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost
>>>>> AG wrote:
>>>>>> Dear list,
>>>>>>
>>>>>> i'm seeing btrfs no space messages frequently on big filesystems (>
>>>>>> 30TB).
>>>>>>
>>>>>> In all cases i'm getting a trace like this one a space_info warning.
>>>>>> (since commit [1]). Could someone please be so kind and help me
>>>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>>>> systems.
>>>>>
>>>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>>>> rather than the free space tree itself. I haven't debugged one of these
>>>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>>>
>>>> I should've asked, what sort of filesystem activity triggers this?
>>>>
>>>
>>> Chris just fixed this I think, try his next branch from his git tree
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>>
>> Thanks now running a 4.4 with those patches backported. If that still
>> shows an error i will try that vanilla tree.
> 
> OK this didn't work. I'll start / try using the linux-btrfs next branch
> and look if this helps.
> 
> Greets,
> Stefan
> 
>>
>> Thanks!
>>
>> Stefan
>>
>>> and see if it still happens.  Thanks,
>>>
>>> Josef


Re: memory overflow or underflow in free space tree / space_info?

2016-08-08 Thread Stefan Priebe - Profihost AG
On 04.08.2016 at 13:40, Stefan Priebe - Profihost AG wrote:
> On 29.07.2016 at 23:03, Josef Bacik wrote:
>> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost
>>>> AG wrote:
>>>>> Dear list,
>>>>>
>>>>> i'm seeing btrfs no space messages frequently on big filesystems (>
>>>>> 30TB).
>>>>>
>>>>> In all cases i'm getting a trace like this one a space_info warning.
>>>>> (since commit [1]). Could someone please be so kind and help me
>>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>>> systems.
>>>>
>>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>>> rather than the free space tree itself. I haven't debugged one of these
>>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>>
>>> I should've asked, what sort of filesystem activity triggers this?
>>>
>>
>> Chris just fixed this I think, try his next branch from his git tree
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
> 
> Thanks now running a 4.4 with those patches backported. If that still
> shows an error i will try that vanilla tree.

OK, this didn't work. I'll try using the linux-btrfs next branch and see
if that helps.

Greets,
Stefan

> 
> Thanks!
> 
> Stefan
> 
>> and see if it still happens.  Thanks,
>>
>> Josef


Re: memory overflow or underflow in free space tree / space_info?

2016-08-04 Thread Stefan Priebe - Profihost AG
On 29.07.2016 at 23:03, Josef Bacik wrote:
> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost
>>> AG wrote:
>>>> Dear list,
>>>>
>>>> i'm seeing btrfs no space messages frequently on big filesystems (>
>>>> 30TB).
>>>>
>>>> In all cases i'm getting a trace like this one a space_info warning.
>>>> (since commit [1]). Could someone please be so kind and help me
>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>> systems.
>>>
>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>> rather than the free space tree itself. I haven't debugged one of these
>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>
>> I should've asked, what sort of filesystem activity triggers this?
>>
> 
> Chris just fixed this I think, try his next branch from his git tree
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git

Thanks, now running a 4.4 with those patches backported. If that still
shows an error I will try that vanilla tree.

Thanks!

Stefan

> and see if it still happens.  Thanks,
> 
> Josef


Re: memory overflow or underflow in free space tree / space_info?

2016-07-29 Thread Stefan Priebe - Profihost AG

On 29.07.2016 at 21:14, Omar Sandoval wrote:
> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>> Dear list,
>>>
>>> i'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>>
>>> In all cases i'm getting a trace like this one a space_info warning.
>>> (since commit [1]). Could someone please be so kind and help me
>>> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
>>
>> Hm, so I think this indicates a bug in space accounting somewhere else
>> rather than the free space tree itself. I haven't debugged one of these
>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
> 
> I should've asked, what sort of filesystem activity triggers this?
> 

Sure.


The workload on the FS is basically:
- Write file1 (50GB - 500GB)

- cp --reflink=always file1 to file2
- apply changes to file2 (100MB - 5GB)

- cp --reflink=always file2 to file3
- apply changes to file3 (100MB - 5GB)

...

- delete file1

- cp --reflink=always file3 to file4
- apply changes to file4 (100MB - 5GB)

- delete file2

...

And this for around 300 files a day. A btrfs balance with dusage=5 and
musage=5 runs daily, sometimes in parallel with the workload above.
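The steps above can be sketched as a short shell script. All paths and
contents here are made up for illustration; `--reflink=auto` is used so the
sketch also runs on non-CoW filesystems, while the real workload uses
`--reflink=always` on btrfs:

```shell
#!/bin/sh
# Sketch of the backup workload described above (hypothetical paths/sizes).
# --reflink=auto falls back to a plain copy on non-CoW filesystems;
# on btrfs it shares extents like --reflink=always.
set -e
dir=$(mktemp -d)

printf 'base image\n' > "$dir/file1"           # initial full backup
cp --reflink=auto "$dir/file1" "$dir/file2"    # clone: shares extents on btrfs
printf 'delta 1\n' >> "$dir/file2"             # apply changes to the clone

cp --reflink=auto "$dir/file2" "$dir/file3"
printf 'delta 2\n' >> "$dir/file3"

rm "$dir/file1"                                # drop the oldest generation

cat "$dir/file3"
rm -r "$dir"
```

On btrfs each `cp --reflink` only references existing extents, so the
per-generation cost is just the applied delta plus metadata.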

Greets,
Stefan


Re: memory overflow or underflow in free space tree / space_info?

2016-07-29 Thread Stefan Priebe - Profihost AG
On 29.07.2016 at 21:11, Omar Sandoval wrote:
> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>> Dear list,
>>
>> i'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>
>> In all cases i'm getting a trace like this one a space_info warning.
>> (since commit [1]). Could someone please be so kind and help me
>> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
> 
> Hm, so I think this indicates a bug in space accounting somewhere else
> rather than the free space tree itself. I haven't debugged one of these
> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.

Thanks.

>> [ cut here ]
>> WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5710
> 
> Do these line numbers match up with yours?
> 
>   5706    static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
>   5707    {
>   5708            block_rsv_release_bytes(fs_info, &fs_info->global_block_rsv, NULL,
>   5709                            (u64)-1);
>   5710            WARN_ON(fs_info->delalloc_block_rsv.size > 0);
>   5711            WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
>   5712            WARN_ON(fs_info->trans_block_rsv.size > 0);
>   5713            WARN_ON(fs_info->trans_block_rsv.reserved > 0);
>   5714            WARN_ON(fs_info->chunk_block_rsv.size > 0);
>   5715            WARN_ON(fs_info->chunk_block_rsv.reserved > 0);
>   5716            WARN_ON(fs_info->delayed_block_rsv.size > 0);
>   5717            WARN_ON(fs_info->delayed_block_rsv.reserved > 0);
>   5718    }

Yes, it does.

But the kernel I'm using is somewhat special: it's a 4.4 kernel with a
patchset from Holger (CC'ed). See here:
https://github.com/hhoffstaette/kernel-patches/tree/c9cce0933a40db84627241143b123210aee0fde6/4.4.15

>> btrfs_free_block_groups+0x35a/0x400 [btrfs]()
>> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
>> raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
>> usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
>> usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
>> raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
>> xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
>> pps_core
>> CPU: 5 PID: 26421 Comm: umount Tainted: GW  O4.4.15+43-ph #1
>> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
>>   880ae8b47cd8 bd3c712f 
>>  c03ec603 880ae8b47d18 bd0837e7 0047a000
>>   8806016a1400 8808881d2088 8808881d2000
>> Call Trace:
>>  [] dump_stack+0x63/0x84
>>  [] warn_slowpath_common+0x97/0xe0
>>  [] warn_slowpath_null+0x1a/0x20
>>  [] btrfs_free_block_groups+0x35a/0x400 [btrfs]
>>  [] close_ctree+0x15b/0x330 [btrfs]
>>  [] btrfs_put_super+0x19/0x20 [btrfs]
>>  [] generic_shutdown_super+0x6f/0x100
>>  [] kill_anon_super+0x16/0x30
>>  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
>>  [] deactivate_locked_super+0x51/0x90
>>  [] deactivate_super+0x4e/0x70
>>  [] cleanup_mnt+0x43/0x90
>>  [] __cleanup_mnt+0x12/0x20
>>  [] task_work_run+0x7e/0xa0
>>  [] exit_to_usermode_loop+0x66/0x95
>>  [] syscall_return_slowpath+0xa6/0xf0
>>  [] int_ret_from_sys_call+0x25/0x8f
>> ---[ end trace bd985b05cc90617f ]---
>> [ cut here ]
>> WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5711
>> btrfs_free_block_groups+0x3f4/0x400 [btrfs]()
>> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
>> raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
>> x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
>> usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
>> usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
>> raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
>> xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
>> pps_core
>> CPU: 5 PID: 26421 Comm: umount Tainted: GW  O4.4.15+43-ph #1
>> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
>>   880ae8b47cd8 bd3c712f 
>>  c03ec603 880ae8b47d18 bd0837e7 0047a000
>>   8806016a1400 f

memory overflow or underflow in free space tree / space_info?

2016-07-29 Thread Stefan Priebe - Profihost AG
Dear list,

I'm seeing btrfs "no space" messages frequently on big filesystems (> 30TB).

In all cases I'm getting a trace like the one below, a space_info warning
(since commit [1]). Could someone please be so kind as to help me debug /
fix this bug? I'm using space_cache=v2 on all those systems.

[ cut here ]
WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5710
btrfs_free_block_groups+0x35a/0x400 [btrfs]()
Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
pps_core
CPU: 5 PID: 26421 Comm: umount Tainted: GW  O4.4.15+43-ph #1
Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
  880ae8b47cd8 bd3c712f 
 c03ec603 880ae8b47d18 bd0837e7 0047a000
  8806016a1400 8808881d2088 8808881d2000
Call Trace:
 [] dump_stack+0x63/0x84
 [] warn_slowpath_common+0x97/0xe0
 [] warn_slowpath_null+0x1a/0x20
 [] btrfs_free_block_groups+0x35a/0x400 [btrfs]
 [] close_ctree+0x15b/0x330 [btrfs]
 [] btrfs_put_super+0x19/0x20 [btrfs]
 [] generic_shutdown_super+0x6f/0x100
 [] kill_anon_super+0x16/0x30
 [] btrfs_kill_super+0x1a/0xb0 [btrfs]
 [] deactivate_locked_super+0x51/0x90
 [] deactivate_super+0x4e/0x70
 [] cleanup_mnt+0x43/0x90
 [] __cleanup_mnt+0x12/0x20
 [] task_work_run+0x7e/0xa0
 [] exit_to_usermode_loop+0x66/0x95
 [] syscall_return_slowpath+0xa6/0xf0
 [] int_ret_from_sys_call+0x25/0x8f
---[ end trace bd985b05cc90617f ]---
[ cut here ]
WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5711
btrfs_free_block_groups+0x3f4/0x400 [btrfs]()
Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
pps_core
CPU: 5 PID: 26421 Comm: umount Tainted: GW  O4.4.15+43-ph #1
Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
  880ae8b47cd8 bd3c712f 
 c03ec603 880ae8b47d18 bd0837e7 0047a000
  8806016a1400 8808881d2088 8808881d2000
Call Trace:
 [] dump_stack+0x63/0x84
 [] warn_slowpath_common+0x97/0xe0
 [] warn_slowpath_null+0x1a/0x20
 [] btrfs_free_block_groups+0x3f4/0x400 [btrfs]
 [] close_ctree+0x15b/0x330 [btrfs]
 [] btrfs_put_super+0x19/0x20 [btrfs]
 [] generic_shutdown_super+0x6f/0x100
 [] kill_anon_super+0x16/0x30
 [] btrfs_kill_super+0x1a/0xb0 [btrfs]
 [] deactivate_locked_super+0x51/0x90
 [] deactivate_super+0x4e/0x70
 [] cleanup_mnt+0x43/0x90
 [] __cleanup_mnt+0x12/0x20
 [] task_work_run+0x7e/0xa0
 [] exit_to_usermode_loop+0x66/0x95
 [] syscall_return_slowpath+0xa6/0xf0
 [] int_ret_from_sys_call+0x25/0x8f
---[ end trace bd985b05cc906180 ]---
[ cut here ]
WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:9990
btrfs_free_block_groups+0x2a4/0x400 [btrfs]()
Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
pps_core
CPU: 5 PID: 26421 Comm: umount Tainted: GW  O4.4.15+43-ph #1
Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
  880ae8b47cd8 bd3c712f 
 c03ec603 880ae8b47d18 bd0837e7 880c6aaa4528
 0038  8802fe8d8c88 8808881d2000
Call Trace:
 [] dump_stack+0x63/0x84
 [] warn_slowpath_common+0x97/0xe0
 [] warn_slowpath_null+0x1a/0x20
 [] btrfs_free_block_groups+0x2a4/0x400 [btrfs]
 [] close_ctree+0x15b/0x330 [btrfs]
 [] btrfs_put_super+0x19/0x20 [btrfs]
 [] generic_shutdown_super+0x6f/0x100
 [] kill_anon_super+0x16/0x30
 [] btrfs_kill_super+0x1a/0xb0 [btrfs]
 [] deactivate_locked_super+0x51/0x90
 [] deactivate_super+0x4e/0x70
 [] cleanup_mnt+0x43/0x90
 [] __cleanup_mnt+0x12/0x20
 [] task_work_run+0x7e/0xa0

Re: ENOSPC / no space on very large devices

2016-07-28 Thread Stefan Priebe - Profihost AG

On 20.07.2016 at 09:35, Holger Hoffstätte wrote:
> On 07/20/16 07:31, Stefan Priebe - Profihost AG wrote:
>> Hi list,
>>
>> while i didn't had the problem for some month i'm now getting ENOSPC on
>> a regular basis on one host.
> 
> Well, it's getting better. :)

Again the same problem.

> 
>> if i umount the volume i get traces (i already did a clear_cache 4 days
>> ago to recalculate the space_tree):
>>
>> [545031.675797] [ cut here ]
>> [545031.725166] WARNING: CPU: 1 PID: 17711 at
>> fs/btrfs/extent-tree.c:5710 btrfs_free_block_groups+0x35a/0x400 [btrfs]()
> 
> This is "only" a warning, but as we can see below it indicates a real
> problem. The warning was added only recently to for-next by the patch called
> "Btrfs: warn_on for unaccounted spaces" [1], but I've had it in my tree
> forever. Never seen the warning myself.
> 
> (snip)
>> [545037.909700] BTRFS: space_info 4 has 18446743523026157568 free, is
>> not full
> 
> Wow, ~18.4 exabytes really is a lot of free space. :)
> So it looks like something underflowed the space_info and now things are
> confused for about ~550 GB. Unfortunately I have no good idea how to fix
> that. :(

umount triggered this one:
[983102.838217] [ cut here ]
[983102.864383] WARNING: CPU: 1 PID: 483 at fs/btrfs/extent-tree.c:5710
btrfs_free_block_groups+0x35a/0x400 [btrfs]()
[983102.894424] Modules linked in: netconsole xt_multiport
iptable_filter ip_tables x_tables 8021q garp bonding usbhid coretemp
loop xhci_pci ehci_pci xhci_hcd ehci_hcd i40e(O) sb_edac vxlan
ip6_udp_tunnel usbcore ipmi_si i2c_i801 shpchp usb_common udp_tunnel
edac_core ipmi_msghandler button btrfs dm_mod raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 md_mod igb
i2c_algo_bit i2c_core sg sd_mod ptp ahci libahci pps_core aacraid
[983103.043010] CPU: 1 PID: 483 Comm: umount Tainted: G   O
4.4.15+43-ph #1
[983103.084441] Hardware name: Supermicro Super Server/X10SRi-F, BIOS
2.0 12/17/2015
[983103.127289]   880074673cd8 ad3c712f

[983103.171816]  c0298603 880074673d18 ad0837e7
000b2000
[983103.216766]   88103bd7e600 881037c92088
881037c92000
[983103.262776] Call Trace:
[983103.308602]  [] dump_stack+0x63/0x84
[983103.355609]  [] warn_slowpath_common+0x97/0xe0
[983103.403528]  [] warn_slowpath_null+0x1a/0x20
[983103.451297]  []
btrfs_free_block_groups+0x35a/0x400 [btrfs]
[983103.500439]  [] close_ctree+0x15b/0x330 [btrfs]
[983103.548805]  [] btrfs_put_super+0x19/0x20 [btrfs]
[983103.597122]  [] generic_shutdown_super+0x6f/0x100
[983103.645398]  [] kill_anon_super+0x16/0x30
[983103.693384]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
[983103.742430]  [] deactivate_locked_super+0x51/0x90
[983103.791501]  [] deactivate_super+0x4e/0x70
[983103.839979]  [] cleanup_mnt+0x43/0x90
[983103.889050]  [] __cleanup_mnt+0x12/0x20
[983103.937756]  [] task_work_run+0x7e/0xa0
[983103.986032]  [] exit_to_usermode_loop+0x66/0x95
[983104.035214]  [] syscall_return_slowpath+0xa6/0xf0
[983104.084312]  [] int_ret_from_sys_call+0x25/0x8f
[983104.134098] ---[ end trace ca97a745adcb888f ]---
[983104.184540] [ cut here ]
[983104.235514] WARNING: CPU: 1 PID: 483 at fs/btrfs/extent-tree.c:5711
btrfs_free_block_groups+0x3f4/0x400 [btrfs]()
[983104.290282] Modules linked in: netconsole xt_multiport
iptable_filter ip_tables x_tables 8021q garp bonding usbhid coretemp
loop xhci_pci ehci_pci xhci_hcd ehci_hcd i40e(O) sb_edac vxlan
ip6_udp_tunnel usbcore ipmi_si i2c_i801 shpchp usb_common udp_tunnel
edac_core ipmi_msghandler button btrfs dm_mod raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 md_mod igb
i2c_algo_bit i2c_core sg sd_mod ptp ahci libahci pps_core aacraid
[983104.536076] CPU: 1 PID: 483 Comm: umount Tainted: GW  O
4.4.15+43-ph #1
[983104.601962] Hardware name: Supermicro Super Server/X10SRi-F, BIOS
2.0 12/17/2015
[983104.669312]   880074673cd8 ad3c712f

[983104.738337]  c0298603 880074673d18 ad0837e7
000b2000
[983104.807874]   88103bd7e600 881037c92088
881037c92000
[983104.878415] Call Trace:
[983104.948803]  [] dump_stack+0x63/0x84
[983105.020781]  [] warn_slowpath_common+0x97/0xe0
[983105.093734]  [] warn_slowpath_null+0x1a/0x20
[983105.166888]  []
btrfs_free_block_groups+0x3f4/0x400 [btrfs]
[983105.241461]  [] close_ctree+0x15b/0x330 [btrfs]
[983105.316744]  [] btrfs_put_super+0x19/0x20 [btrfs]
[983105.392021]  [] generic_shutdown_super+0x6f/0x100
[983105.465852]  [] kill_anon_super+0x16/0x30
[983105.537829]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
[983105.608078]  [] deactivate_locked_super+0x51/0x90
[983105.676494]  [] deactivate_super

Re: ENOSPC / no space on very large devices

2016-07-28 Thread Stefan Priebe - Profihost AG
Here we go...

On 20.07.2016 at 08:31, Wang Xiaoguang wrote:
> hello,
> 
> On 07/20/2016 01:31 PM, Stefan Priebe - Profihost AG wrote:
>> Hi list,
>>
>> while i didn't had the problem for some month i'm now getting ENOSPC on
>> a regular basis on one host.
>>
>> It would be great if someone can help me debugging this.
>>
>> Some basic informations:
>> # touch /vmbackup/abc
>> touch: cannot touch `/vmbackup/abc': No space left on device
> When touch operation failed, would you please change dir to
> /sys/fs/btrfs/UUID/allocation/data/ and show me these files' value.
> And also files in /sys/fs/btrfs/UUID/allocation/metadata. thanks.
> Here UUID is your real uuid :)

/sys/fs/btrfs/ebcb9a5e-d784-4e17-9cd0-bc67fe7b1ed6/allocation/data]# grep -H '' *
bytes_may_use:0
bytes_pinned:0
bytes_reserved:0
bytes_used:6175380234240
disk_total:6641093181440
disk_used:6175380234240
flags:1
grep: single: Is a directory
total_bytes:6641093181440
total_bytes_pinned:726104035328

/sys/fs/btrfs/ebcb9a5e-d784-4e17-9cd0-bc67fe7b1ed6/allocation/metadata]# grep -H '' *
bytes_may_use:2089625649152
bytes_pinned:0
bytes_reserved:0
bytes_used:36823187456
disk_total:95563022336
disk_used:73646374912
grep: dup: Is a directory
flags:4
total_bytes:47781511168
total_bytes_pinned:-16792829952
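A negative total_bytes_pinned, or bytes_may_use far beyond total_bytes, is
exactly the kind of under-/overflow this thread is about. As a minimal
sketch (not an official btrfs tool), the counters pasted above can be
sanity-checked like this; on a live system you would read the files under
/sys/fs/btrfs/<UUID>/allocation/metadata instead of the inlined sample:

```shell
#!/bin/sh
# Flag implausible btrfs space counters. Sample values are pasted in from
# the metadata output above; this is an illustrative sketch only.
awk -F: '
    { v[$1] = $2 + 0 }                     # force numeric comparison
    END {
        if (v["total_bytes_pinned"] < 0)
            print "underflow: total_bytes_pinned is negative"
        if (v["bytes_may_use"] > v["total_bytes"])
            print "overflow: bytes_may_use exceeds total_bytes"
    }' <<EOF
bytes_may_use:2089625649152
total_bytes:47781511168
total_bytes_pinned:-16792829952
EOF
```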

Greets,
Stefan

> 
> Regards,
> Xiaoguang Wang
> 
>> # df -h /vmbackup/
>> FilesystemSize  Used Avail Use% Mounted on
>> /dev/mapper/stripe0-vmbackup   37T   28T  8,5T  77% /vmbackup
>>
>> # btrfs filesystem df /vmbackup/
>> Data, single: total=27.87TiB, used=27.39TiB
>> System, DUP: total=8.00MiB, used=4.34MiB
>> Metadata, DUP: total=286.50GiB, used=199.91GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> # btrfs filesystem show /vmbackup/
>> Label: none  uuid: c8c3abf7-8280-4baa-bb51-a8c599e48002
>>  Total devices 1 FS bytes used 27.59TiB
>>  devid1 size 36.38TiB used 28.43TiB path
>> /dev/mapper/stripe0-vmbackup
>>
>> # mount | grep vmbackup
>> /dev/mapper/stripe0-vmbackup on /vmbackup type btrfs
>> (rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,clear_cache,commit=300,subvolid=5,subvol=/)
>>
>>
>> dmesg is empty.
>>
>> if i umount the volume i get traces (i already did a clear_cache 4 days
>> ago to recalculate the space_tree):
>>
>> [545031.675797] [ cut here ]
>> [545031.725166] WARNING: CPU: 1 PID: 17711 at
>> fs/btrfs/extent-tree.c:5710 btrfs_free_block_groups+0x35a/0x400 [btrfs]()
>> [545031.778329] Modules linked in: netconsole ipt_REJECT nf_reject_ipv4
>> mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter
>> ip_tables x_tables 8021q garp bonding coretemp loop i40e(O) vxlan
>> ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd
>> i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler
>> button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy
>> async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod
>> ahci ptp libahci megaraid_sas pps_core
>> [545032.081037] CPU: 1 PID: 17711 Comm: umount Tainted: G   O
>> 4.4.15+43-ph #1
>> [545032.145078] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
>> 02/18/2015
>> [545032.210238]   88010c40bcd8 bd3c712f
>> 
>> [545032.275650]  c03ec603 88010c40bd18 bd0837e7
>> 0047a000
>> [545032.341525]   88105e0ea400 881054a76088
>> 881054a76000
>> [545032.408500] Call Trace:
>> [545032.475272]  [] dump_stack+0x63/0x84
>> [545032.543620]  [] warn_slowpath_common+0x97/0xe0
>> [545032.612900]  [] warn_slowpath_null+0x1a/0x20
>> [545032.682026]  []
>> btrfs_free_block_groups+0x35a/0x400 [btrfs]
>> [545032.750297]  [] close_ctree+0x15b/0x330 [btrfs]
>> [545032.817085]  [] btrfs_put_super+0x19/0x20 [btrfs]
>> [545032.883439]  [] generic_shutdown_super+0x6f/0x100
>> [545032.949302]  [] kill_anon_super+0x16/0x30
>> [545033.014327]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
>> [545033.079031]  [] deactivate_locked_super+0x51/0x90
>> [545033.143275]  [] deactivate_super+0x4e/0x70
>> [545033.206535]  [] cleanup_mnt+0x43/0x90
>> [545033.268842]  [] __cleanup_mnt+0x12/0x20
>> [545033.331629]  [] task_work_run+0x7e/0xa0
>> [545033.393350]  [] exit_to_usermode_loop+0x66/0x95
>> [545033.454685]  [] syscall_return_slowpath+0xa6/0xf0
>> [545033.515485]  [] int_ret_from_sys_call+0x25/0x8f
>> [545033.575890] ---[ end trace bd985b05cc90617c ]---
>> [545033.636708] -

ENOSPC / no space on very large devices

2016-07-19 Thread Stefan Priebe - Profihost AG
Hi list,

While I didn't have the problem for some months, I'm now getting ENOSPC
on a regular basis on one host.

It would be great if someone could help me debug this.

Some basic information:
# touch /vmbackup/abc
touch: cannot touch `/vmbackup/abc': No space left on device

# df -h /vmbackup/
FilesystemSize  Used Avail Use% Mounted on
/dev/mapper/stripe0-vmbackup   37T   28T  8,5T  77% /vmbackup

# btrfs filesystem df /vmbackup/
Data, single: total=27.87TiB, used=27.39TiB
System, DUP: total=8.00MiB, used=4.34MiB
Metadata, DUP: total=286.50GiB, used=199.91GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
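For a quick read of output like the above, the per-profile slack (total
minus used) can be computed with a small awk sketch. The sample lines are
pasted from the report; on a live box you would pipe
`btrfs filesystem df <mountpoint>` in instead. Note that neither data nor
metadata is actually exhausted here, which is what makes the ENOSPC
suspicious:

```shell
#!/bin/sh
# Compute per-profile slack from `btrfs filesystem df` style output.
# Assumes total= and used= on one line share the same unit, as btrfs prints.
awk -F'[=,]' '/total=.*used=/ {
    split($3, t, /[A-Za-z]/)   # strip the unit suffix, e.g. "27.87TiB"
    split($5, u, /[A-Za-z]/)
    printf "%s slack=%.2f\n", $1, t[1] - u[1]
}' <<EOF
Data, single: total=27.87TiB, used=27.39TiB
Metadata, DUP: total=286.50GiB, used=199.91GiB
EOF
```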

# btrfs filesystem show /vmbackup/
Label: none  uuid: c8c3abf7-8280-4baa-bb51-a8c599e48002
Total devices 1 FS bytes used 27.59TiB
devid1 size 36.38TiB used 28.43TiB path
/dev/mapper/stripe0-vmbackup

# mount | grep vmbackup
/dev/mapper/stripe0-vmbackup on /vmbackup type btrfs
(rw,noatime,compress-force=zlib,nossd,noacl,space_cache=v2,clear_cache,commit=300,subvolid=5,subvol=/)

dmesg is empty.

If I umount the volume I get traces (I already did a clear_cache 4 days
ago to recalculate the space_tree):

[545031.675797] [ cut here ]
[545031.725166] WARNING: CPU: 1 PID: 17711 at
fs/btrfs/extent-tree.c:5710 btrfs_free_block_groups+0x35a/0x400 [btrfs]()
[545031.778329] Modules linked in: netconsole ipt_REJECT nf_reject_ipv4
mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter
ip_tables x_tables 8021q garp bonding coretemp loop i40e(O) vxlan
ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd
i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler
button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy
async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod
ahci ptp libahci megaraid_sas pps_core
[545032.081037] CPU: 1 PID: 17711 Comm: umount Tainted: G   O
4.4.15+43-ph #1
[545032.145078] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
02/18/2015
[545032.210238]   88010c40bcd8 bd3c712f

[545032.275650]  c03ec603 88010c40bd18 bd0837e7
0047a000
[545032.341525]   88105e0ea400 881054a76088
881054a76000
[545032.408500] Call Trace:
[545032.475272]  [] dump_stack+0x63/0x84
[545032.543620]  [] warn_slowpath_common+0x97/0xe0
[545032.612900]  [] warn_slowpath_null+0x1a/0x20
[545032.682026]  []
btrfs_free_block_groups+0x35a/0x400 [btrfs]
[545032.750297]  [] close_ctree+0x15b/0x330 [btrfs]
[545032.817085]  [] btrfs_put_super+0x19/0x20 [btrfs]
[545032.883439]  [] generic_shutdown_super+0x6f/0x100
[545032.949302]  [] kill_anon_super+0x16/0x30
[545033.014327]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
[545033.079031]  [] deactivate_locked_super+0x51/0x90
[545033.143275]  [] deactivate_super+0x4e/0x70
[545033.206535]  [] cleanup_mnt+0x43/0x90
[545033.268842]  [] __cleanup_mnt+0x12/0x20
[545033.331629]  [] task_work_run+0x7e/0xa0
[545033.393350]  [] exit_to_usermode_loop+0x66/0x95
[545033.454685]  [] syscall_return_slowpath+0xa6/0xf0
[545033.515485]  [] int_ret_from_sys_call+0x25/0x8f
[545033.575890] ---[ end trace bd985b05cc90617c ]---
[545033.636708] [ cut here ]
[545033.696339] WARNING: CPU: 1 PID: 17711 at
fs/btrfs/extent-tree.c:5711 btrfs_free_block_groups+0x3f4/0x400 [btrfs]()
[545033.758031] Modules linked in: netconsole ipt_REJECT nf_reject_ipv4
mpt3sas raid_class scsi_transport_sas xt_multiport iptable_filter
ip_tables x_tables 8021q garp bonding coretemp loop i40e(O) vxlan
ip6_udp_tunnel usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd
i2c_i801 i2c_core usbcore shpchp usb_common ipmi_si ipmi_msghandler
button btrfs dm_mod raid1 raid456 async_raid6_recov async_memcpy
async_pq async_xor async_tx xor raid6_pq md_mod ixgbe mdio sg sd_mod
ahci ptp libahci megaraid_sas pps_core
[545034.095188] CPU: 1 PID: 17711 Comm: umount Tainted: GW  O
4.4.15+43-ph #1
[545034.166070] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
02/18/2015
[545034.236259]   88010c40bcd8 bd3c712f

[545034.307690]  c03ec603 88010c40bd18 bd0837e7
0047a000
[545034.379596]   88105e0ea400 881054a76088
881054a76000
[545034.452542] Call Trace:
[545034.525286]  [] dump_stack+0x63/0x84
[545034.599643]  [] warn_slowpath_common+0x97/0xe0
[545034.674894]  [] warn_slowpath_null+0x1a/0x20
[545034.750338]  []
btrfs_free_block_groups+0x3f4/0x400 [btrfs]
[545034.826354]  [] close_ctree+0x15b/0x330 [btrfs]
[545034.900758]  [] btrfs_put_super+0x19/0x20 [btrfs]
[545034.973612]  [] generic_shutdown_super+0x6f/0x100
[545035.044589]  [] kill_anon_super+0x16/0x30
[545035.113505]  [] btrfs_kill_super+0x1a/0xb0 [btrfs]
[545035.180769]  [] deactivate_locked_super+0x51/0x90
[545035.246451]  [] deactivate_super+0x4e/0x70
[545035.311231]  [] cleanup_mnt+0x43/0x90
[545035.374958]  [] 
