Re: btrfs and numa - needing drop_caches to keep speed up

2016-10-14 Thread Julian Taylor

On 10/14/2016 08:28 AM, Stefan Priebe - Profihost AG wrote:

Hello list,

While running the same workload on two machines (a single-Xeon and a dual-Xeon
box), both with 64GB RAM, I need to run echo 3 > /proc/sys/vm/drop_caches every
15-30 minutes on the dual-Xeon machine to keep the speed as good as on the
non-NUMA system. I'm not sure whether this is related to NUMA.

Is there any sysctl parameter to tune?

Tested with vanilla v4.8.1

Greets,
Stefan


hi,
why do you think this is related to btrfs?

The only known issue I am aware of with this type of workaround is transparent
huge pages (THP).
This is easy to diagnose by recording some kernel stacks with perf while the
problem is occurring.
If there is very high system CPU usage in a spinlock called from compaction
functions during page faults, it is the synchronous memory defragmentation
needed for THP.
Should that be the case, the better workaround is to disable it in
/sys/kernel/mm/transparent_hugepage/defrag
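
A rough sketch of the diagnosis and the workaround, assuming the usual perf
tooling and sysfs path (the 30 second sampling window and the "never" setting
are just examples):

 # record kernel stacks system-wide while the slowdown is happening
 $ sudo perf record -a -g -- sleep 30
 $ sudo perf report
 # check the current THP defrag mode (the active value is in brackets)
 $ cat /sys/kernel/mm/transparent_hugepage/defrag
 # disable synchronous defragmentation on page faults (not persistent
 # across reboots)
 $ echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag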


cheers,
Julian


Re: Recommended way to use btrfs for production?

2016-06-03 Thread Julian Taylor

On 06/03/2016 03:31 PM, Martin wrote:

In general, avoid Ubuntu LTS versions when dealing with BTRFS, as well as
most enterprise distros; they all tend to back-port patches instead of using
newer kernels, which means it's functionally impossible to provide good
support for them here (because we can't know for sure what exactly they've
back-ported).  I'd suggest building your own kernel if possible, with Arch
Linux being a close second (they follow upstream very closely), followed by
Fedora and non-LTS Ubuntu.


Then I would build my own, if that is the preferred option.



Ubuntu also provides newer kernels for their LTS releases via the Hardware
Enablement Stack:


https://wiki.ubuntu.com/Kernel/LTSEnablementStack

So if you can live with roughly a 6-month time lag and the shorter support
window of the non-LTS kernels those packages are based on, that is a good
option.
As you can see there, 16.04 currently provides 4.4 and the next update will
likely be 4.8.
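
A minimal sketch of switching to the HWE kernel on 16.04; the meta-package
name is an assumption on my part, so check the wiki page above for the one
matching your release and point release:

 $ sudo apt-get update
 # install the hardware enablement kernel meta-package (name may differ
 # per point release)
 $ sudo apt-get install --install-recommends linux-generic-hwe-16.04
 $ sudo reboot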



Re: enospace regression in 4.4

2016-04-12 Thread Julian Taylor
On 12.04.2016 20:09, Henk Slager wrote:
> On Tue, Apr 12, 2016 at 5:52 PM, Julian Taylor
> <jtaylor.deb...@googlemail.com> wrote:
>> A smaller testcase that shows the immediate enospc after fallocate -> rm,
>> though I don't know if it is really related to the full filesystem
>> bugging out, as the balance does work if you wait a few seconds before
>> running it.
>> But this sequence of commands did work in 4.2.
>>
>>  $ sudo btrfs fi show /dev/mapper/lvm-testing
>> Label: none  uuid: 25889ba9-a957-415a-83b0-e34a62cb3212
>>   Total devices 1 FS bytes used 225.18MiB
>>   devid 1 size 5.00GiB used 788.00MiB path /dev/mapper/lvm-testing
>>
>>  $ fallocate -l 4.4G test.dat
>>  $ rm -f test.dat
>>  $ sudo btrfs fi balance start -dusage=0 .
>> ERROR: error during balancing '.': No space left on device
>> There may be more info in syslog - try dmesg | tail
> 
> It seems that kernel 4.4.6 waits longer before de-allocating empty
> chunks, and the balance kicks in at a time when the 5 GiB is still
> completely filled with chunks. As balance needs unallocated space (on
> the device level; how much depends on the profiles), this error can be
> expected.

Hm, ok, I'll put a sleep in the script then.
fallocate; rm; fallocate seems to work, so it's probably ok in normal usage.
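
Roughly what I have in mind; the sleep length is arbitrary, and the
unallocated-space check is just an optional sanity check:

 $ fallocate -l 4.4G test.dat
 $ rm -f test.dat
 # give the kernel a moment to de-allocate the now-empty chunks
 $ sleep 10
 # optionally confirm there is unallocated space again before balancing
 $ sudo btrfs fi usage . | grep -i unallocated
 $ sudo btrfs fi balance start -dusage=0 .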


> 
>> On 04/12/2016 12:24 PM, Julian Taylor wrote:
>>> hi,
>>> I have a system with two filesystems which are both affected by the
>>> notorious enospace bug when there is plenty of unallocated space
>>> available. The system is a raid0 on two 900 GiB disks and an iscsi
>>> single/dup 1.4TiB.
>>> To deal with the problem I use a cronjob that uses fallocate to give me
>>> advance notice of the issue, so I can apply the only workaround that
>>> works for me, which is to shrink the fs to the minimum and grow it again.
>>> This has worked fine for a couple of months.
>>>
>>> I have now updated from 4.2 to 4.4.6, and it appears my cronjob actually
>>> triggers an immediate enospc in the balance after removing the
>>> fallocated file, and the shrink/resize workaround does not work anymore.
> 
> The filesystem itself is not resized AFAIU, correct?

btrfs resize -XG /mount
So the filesystem is resized, but not the underlying device.

Actually, the system just went into enospc again with unallocated space free,
even after the revert to 4.2, and the shrink trick doesn't want to work
anymore either ...
Though the 4.2 kernel running now is not the same one where the shrink
workaround worked. I'll have to check the changelog to see if there are any
btrfs related changes in it.


> 
> You could shrink a file-system by a few GiB (without changing the
> size of the underlying device), so that once it really gets filled up
> and hits enospc, you resize to max again and delete files or snapshots
> or something. Of course that is no option for a 24/7 unattended system,
> but it might do for a client laptop as a test.
> 

That is basically what I have been doing: I used the cronjob to see when
the enospc issue occurred and then did the shrink/resize to fix it. It was
relatively rare; I had to do it maybe every two months.

But now for some reason that trick doesn't work anymore either: I can
shrink it by 200G and resize it back to max, and it still complains about
no free space. So now I'm at a loss on how to keep this system working.
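
For reference, the shrink/grow trick is basically the following; /data and
the 200G amount are just examples, and on a multi-device filesystem the size
can be prefixed with a devid (e.g. 2:-200G):

 # shrink the filesystem (not the block device) by 200G, then grow it back
 $ sudo btrfs filesystem resize -200G /data
 $ sudo btrfs filesystem resize max /data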


Re: enospace regression in 4.4

2016-04-12 Thread Julian Taylor
A smaller testcase that shows the immediate enospc after fallocate -> rm,
though I don't know if it is really related to the full filesystem
bugging out, as the balance does work if you wait a few seconds before
running it.
But this sequence of commands did work in 4.2.

 $ sudo btrfs fi show /dev/mapper/lvm-testing
Label: none  uuid: 25889ba9-a957-415a-83b0-e34a62cb3212
  Total devices 1 FS bytes used 225.18MiB
  devid 1 size 5.00GiB used 788.00MiB path /dev/mapper/lvm-testing

 $ fallocate -l 4.4G test.dat
 $ rm -f test.dat
 $ sudo btrfs fi balance start -dusage=0 .
ERROR: error during balancing '.': No space left on device
There may be more info in syslog - try dmesg | tail
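
For anyone who wants to try to reproduce this on 4.4 without an LVM volume, a
roughly equivalent sequence on a loop device should work; the image path,
size, mount point and /dev/loop0 are just examples:

 $ truncate -s 5G /tmp/btrfs-test.img
 $ sudo losetup -f --show /tmp/btrfs-test.img
 # losetup prints the loop device it picked; /dev/loop0 below is an assumption
 $ sudo mkfs.btrfs /dev/loop0
 $ sudo mount /dev/loop0 /mnt
 $ cd /mnt
 $ sudo fallocate -l 4.4G test.dat
 $ sudo rm -f test.dat
 $ sudo btrfs fi balance start -dusage=0 .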


On 04/12/2016 12:24 PM, Julian Taylor wrote:
> hi,
> I have a system with two filesystems which are both affected by the
> notorious enospace bug when there is plenty of unallocated space
> available. The system is a raid0 on two 900 GiB disks and an iscsi
> single/dup 1.4TiB.
> To deal with the problem I use a cronjob that uses fallocate to give me
> advance notice of the issue, so I can apply the only workaround that
> works for me, which is to shrink the fs to the minimum and grow it again.
> This has worked fine for a couple of months.
> 
> I have now updated from 4.2 to 4.4.6, and it appears my cronjob actually
> triggers an immediate enospc in the balance after removing the
> fallocated file, and the shrink/resize workaround does not work anymore.
> It is mounted with enospc_debug, but that just says "2 enospc in
> balance". Nothing else useful in the log.
> 
> I had to revert back to 4.2 to get the system running again, so it is
> currently not available for more testing, but I may be able to do more
> tests if required in the future.
> 
> The cronjob does this once a day:
> 
> #!/bin/bash
> sync
> 
> check() {
>   date
>   mnt=$1
>   time btrfs fi balance start -mlimit=2 $mnt
>   btrfs fi balance start -dusage=5 $mnt
>   sync
>   freespace=$(df -B1 $mnt | tail -n 1 | awk '{print $4 - 50*1024*1024*1024}')
>   fallocate -l $freespace $mnt/falloc
>   /usr/sbin/filefrag $mnt/falloc
>   rm -f $mnt/falloc
>   btrfs fi balance start -dusage=0 $mnt
> 
>   time btrfs fi balance start -mlimit=2 $mnt
>   time btrfs fi balance start -dlimit=10 $mnt
>   date
> }
> 
> check /data
> check /data/nas
> 
> 
> btrfs info:
> 
> 
>  ~ $ btrfs --version
> btrfs-progs v4.4
> sagan5 ~ $ sudo btrfs fi show
> Label: none  uuid: e4aef349-7a56-4287-93b1-79233e016aae
>   Total devices 2 FS bytes used 898.18GiB
>   devid 1 size 880.00GiB used 473.03GiB path /dev/mapper/data-linear1
>   devid 2 size 880.00GiB used 473.03GiB path /dev/mapper/data-linear2
> 
> Label: none  uuid: 14040f9b-53c8-46cf-be6b-35de746c3153
>   Total devices 1 FS bytes used 557.19GiB
>   devid 1 size 1.36TiB used 585.95GiB path /dev/sdd
> 
>  ~ $ sudo btrfs fi df /data
> Data, RAID0: total=938.00GiB, used=895.09GiB
> System, RAID1: total=32.00MiB, used=112.00KiB
> Metadata, RAID1: total=4.00GiB, used=3.10GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> sagan5 ~ $ sudo btrfs fi usage /data
> Overall:
>     Device size:           1.72TiB
>     Device allocated:    946.06GiB
>     Device unallocated:  813.94GiB
>     Device missing:          0.00B
>     Used:                901.27GiB
>     Free (estimated):    856.85GiB  (min: 449.88GiB)
>     Data ratio:               1.00
>     Metadata ratio:           2.00
>     Global reserve:      512.00MiB  (used: 0.00B)
> 
> Data,RAID0: Size:938.00GiB, Used:895.09GiB
>    /dev/dm-1                 469.00GiB
>    /dev/mapper/data-linear1  469.00GiB
> 
> Metadata,RAID1: Size:4.00GiB, Used:3.09GiB
>    /dev/dm-1                   4.00GiB
>    /dev/mapper/data-linear1    4.00GiB
> 
> System,RAID1: Size:32.00MiB, Used:112.00KiB
>    /dev/dm-1                  32.00MiB
>    /dev/mapper/data-linear1   32.00MiB
> 
> Unallocated:
>    /dev/dm-1                 406.97GiB
>    /dev/mapper/data-linear1  406.97GiB
> 



enospace regression in 4.4

2016-04-12 Thread Julian Taylor
hi,
I have a system with two filesystems which are both affected by the
notorious enospace bug when there is plenty of unallocated space
available. The system is a raid0 on two 900 GiB disks and an iscsi
single/dup 1.4TiB.
To deal with the problem I use a cronjob that uses fallocate to give me
advance notice of the issue, so I can apply the only workaround that
works for me, which is to shrink the fs to the minimum and grow it again.
This has worked fine for a couple of months.

I have now updated from 4.2 to 4.4.6, and it appears my cronjob actually
triggers an immediate enospc in the balance after removing the
fallocated file, and the shrink/resize workaround does not work anymore.
It is mounted with enospc_debug, but that just says "2 enospc in
balance". Nothing else useful in the log.

I had to revert back to 4.2 to get the system running again, so it is
currently not available for more testing, but I may be able to do more
tests if required in the future.

The cronjob does this once a day:

#!/bin/bash
sync

check() {
  date
  mnt=$1
  # compact metadata chunks and sparsely used data chunks first
  time btrfs fi balance start -mlimit=2 $mnt
  btrfs fi balance start -dusage=5 $mnt
  sync
  # probe for enospc: try to allocate all free space minus 50GiB
  freespace=$(df -B1 $mnt | tail -n 1 | awk '{print $4 - 50*1024*1024*1024}')
  fallocate -l $freespace $mnt/falloc
  /usr/sbin/filefrag $mnt/falloc
  rm -f $mnt/falloc
  # drop the data chunks that are now empty again
  btrfs fi balance start -dusage=0 $mnt

  time btrfs fi balance start -mlimit=2 $mnt
  time btrfs fi balance start -dlimit=10 $mnt
  date
}

check /data
check /data/nas


btrfs info:


 ~ $ btrfs --version
btrfs-progs v4.4
sagan5 ~ $ sudo btrfs fi show
Label: none  uuid: e4aef349-7a56-4287-93b1-79233e016aae
  Total devices 2 FS bytes used 898.18GiB
  devid 1 size 880.00GiB used 473.03GiB path /dev/mapper/data-linear1
  devid 2 size 880.00GiB used 473.03GiB path /dev/mapper/data-linear2

Label: none  uuid: 14040f9b-53c8-46cf-be6b-35de746c3153
  Total devices 1 FS bytes used 557.19GiB
  devid 1 size 1.36TiB used 585.95GiB path /dev/sdd

 ~ $ sudo btrfs fi df /data
Data, RAID0: total=938.00GiB, used=895.09GiB
System, RAID1: total=32.00MiB, used=112.00KiB
Metadata, RAID1: total=4.00GiB, used=3.10GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
sagan5 ~ $ sudo btrfs fi usage /data
Overall:
    Device size:           1.72TiB
    Device allocated:    946.06GiB
    Device unallocated:  813.94GiB
    Device missing:          0.00B
    Used:                901.27GiB
    Free (estimated):    856.85GiB  (min: 449.88GiB)
    Data ratio:               1.00
    Metadata ratio:           2.00
    Global reserve:      512.00MiB  (used: 0.00B)

Data,RAID0: Size:938.00GiB, Used:895.09GiB
   /dev/dm-1                 469.00GiB
   /dev/mapper/data-linear1  469.00GiB

Metadata,RAID1: Size:4.00GiB, Used:3.09GiB
   /dev/dm-1                   4.00GiB
   /dev/mapper/data-linear1    4.00GiB

System,RAID1: Size:32.00MiB, Used:112.00KiB
   /dev/dm-1                  32.00MiB
   /dev/mapper/data-linear1   32.00MiB

Unallocated:
   /dev/dm-1                 406.97GiB
   /dev/mapper/data-linear1  406.97GiB