Re: [4.8] btrfs heats my room with lock contention

2016-08-04 Thread Dave Chinner
On Thu, Aug 04, 2016 at 10:28:44AM -0400, Chris Mason wrote:
> On 08/04/2016 02:41 AM, Dave Chinner wrote:
> > Simple test. 8GB pmem device on a 16p machine:
> >
> > # mkfs.btrfs /dev/pmem1
> > # mount /dev/pmem1 /mnt/scratch
> > # dbench -t 60 -D /mnt/scratch 16
> >
> > And heat your room

[PATCH v2 0/3] Qgroup fix for dirty hack routines

2016-08-04 Thread Qu Wenruo
This patchset introduces two fixes for data extent owner hacks. One can be triggered by balance, the other by log replay after power loss. The root causes are all similar: the EXTENT_DATA owner is changed by dirty hacks, from swapping tree blocks containing EXTENT_DATA to manually update

[PATCH v2 1/3] btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent()

2016-08-04 Thread Qu Wenruo
Refactor the btrfs_qgroup_insert_dirty_extent() function into two functions:
1. _btrfs_qgroup_insert_dirty_extent()
   Almost the same as the original code, for delayed_ref usage, which has delayed refs locked. Change the return value type to int, since the caller never needs the pointer, but only

[PATCH v2 2/3] btrfs: relocation: Fix leaking qgroups numbers on data extents

2016-08-04 Thread Qu Wenruo
When balancing data extents, qgroup will leak all its numbers for relocated data extents. The relocation is done in the following steps for data extents:
1) Create data reloc tree and inode
2) Copy all data extents to data reloc tree, and commit transaction
3) Create tree reloc tree (special

[PATCH v2 3/3] btrfs: qgroup: Fix qgroup incorrectness caused by log replay

2016-08-04 Thread Qu Wenruo
When doing log replay at mount time (after power loss), qgroup will leak the numbers of replayed data extents. The cause is almost the same as for balance, so fix it by manually informing qgroup about owner-changed extents. The bug can be detected by the btrfs/119 test case. Cc: Mark Fasheh

[PATCH v3] xfs: test attr_list_by_handle cursor iteration

2016-08-04 Thread Darrick J. Wong
Apparently the XFS attr_list_by_handle ioctl has never actually copied the cursor contents back to user space, which means that iteration has never worked. Add a test case for this and see "xfs: in _attrlist_by_handle, copy the cursor back to userspace". v2: Use BULKSTAT_SINGLE for less

Re: [PATCH] exportfs: be careful to only return expected errors.

2016-08-04 Thread NeilBrown
On Thu, Aug 04 2016, Christoph Hellwig wrote:
> On Thu, Aug 04, 2016 at 10:19:06AM +1000, NeilBrown wrote:
>>
>> When nfsd calls fh_to_dentry, it expects ESTALE or ENOMEM as errors.
>> In particular it can be tempting to return ENOENT, but this is not
>> handled well by nfsd.
>>
>> Rather

Re: possible bug - wrong path in 'btrfs subvolume show' when snapshot is in path below subvolume.

2016-08-04 Thread Peter Holm
Writing error: replace "gives no path to" with "same path as".
/Peter Holm

2016-08-05 1:32 GMT+02:00, Peter Holm:
> 'btrfs subvolume show' gives no path to btrfs system root (volid=5)
> when snapshot is in the folder of subvolume.
>
> Steps to reproduce.
> 1. btrfs

possible bug - wrong path in 'btrfs subvolume show' when snapshot is in path below subvolume.

2016-08-04 Thread Peter Holm
'btrfs subvolume show' gives no path to the btrfs system root (volid=5) when the snapshot is in a folder below the subvolume.

Steps to reproduce:
1. btrfs subvolume create xyz
2. btrfs subvolume snapshot xyz xyz/xyz
3. btrfs subvolume snapshot /xyz
4. btrfs subvolume show xyz
output . Snapshot(s)
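The reproduction steps above can be scripted. A minimal sketch, assuming a btrfs filesystem is mounted at MNT (the mount point is an assumption, not from the original report); it skips cleanly when no btrfs mount is present:

```shell
#!/bin/sh
# Sketch: reproduce the 'btrfs subvolume show' path issue.
# MNT is an assumed btrfs mount point; adjust to your setup.
MNT=${MNT:-/mnt/scratch}
fstype=$(stat -f -c %T "$MNT" 2>/dev/null)

if command -v btrfs >/dev/null 2>&1 && [ "$fstype" = "btrfs" ]; then
    (
        cd "$MNT" &&
        btrfs subvolume create xyz &&          # top-level subvolume
        btrfs subvolume snapshot xyz xyz/xyz && # snapshot nested below it
        btrfs subvolume show xyz                # inspect the reported path
    )
    STATUS=ran
else
    echo "no btrfs mount at $MNT; skipping"
    STATUS=skipped
fi
```

The `Snapshot(s)` section of the final command's output is where the report says the path comes out wrong.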

Re: BTRFS: Transaction aborted (error -28)

2016-08-04 Thread Mordechay Kaganer
B.H.

> On Fri, Jul 29, 2016 at 8:23 PM, Duncan <1i5t5.dun...@cox.net> wrote:
>> So I'd recommend upgrading to the latest kernel 4.4 if you want to stay
>> with the stable series, or 4.6 or 4.7 if you want current, and then (less
>> important) upgrading the btrfs userspace as well. It's possible

Re: How to stress test raid6 on 122 disk array

2016-08-04 Thread Martin
Excellent, thanks. In order to automate it, would it be OK if I dd some zeroes directly to the devices to corrupt them, or do I need to physically take the disks out while running? The smallest disk of the 122 is 500GB. Is it possible to have btrfs see each disk as only e.g. 10GB? That way I can
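One way to get many small "disks" without repartitioning the real drives is sparse backing files attached as loop devices. A sketch under stated assumptions: the file names, sizes, and device count are arbitrary; attaching loop devices and running mkfs need root, so that part is guarded:

```shell
#!/bin/sh
# Create four sparse 10G backing files; sparse files take almost no real space.
for i in 1 2 3 4; do
    truncate -s 10G "disk$i.img"
done

# Attaching them and building a raid6 filesystem needs root and loop support:
if [ "$(id -u)" = "0" ] && command -v mkfs.btrfs >/dev/null 2>&1 \
   && command -v losetup >/dev/null 2>&1 && [ -e /dev/loop-control ]; then
    devs=""
    for i in 1 2 3 4; do
        devs="$devs $(losetup -f --show "disk$i.img")"
    done
    mkfs.btrfs -f -d raid6 -m raid6 $devs
else
    echo "not root or no loop support; created backing files only"
fi

# Simulated corruption: overwrite a 16M region of one backing file in place,
# equivalent to dd'ing zeroes onto the device - no disk-pulling needed.
dd if=/dev/zero of=disk1.img bs=1M count=16 seek=128 conv=notrunc 2>/dev/null
```

Corrupting the backing file while the filesystem is mounted exercises roughly the same recovery paths as corrupting the raw device, without touching the 500GB drives.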

Re: How to stress test raid6 on 122 disk array

2016-08-04 Thread Chris Murphy
On Thu, Aug 4, 2016 at 2:51 PM, Martin wrote:
> Thanks for the benchmark tools and tips on where the issues might be.
>
> Is Fedora 24 rawhide preferred over ArchLinux?

I'm not sure what Arch does any differently to their kernels from kernel.org kernels. But

Re: How to stress test raid6 on 122 disk array

2016-08-04 Thread Martin
Thanks for the benchmark tools and tips on where the issues might be. Is Fedora 24 rawhide preferred over ArchLinux? If I want to compile a mainline kernel, is there anything I need to tune? When I do the tests, how do I log the info you would like to see, if I find a bug? On 4 August 2016

Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem

2016-08-04 Thread Chris Murphy
On Thu, Aug 4, 2016 at 10:53 AM, Lutz Vieweg wrote:
> The amount of threads on "lost or unused free space" without resolutions
> in the btrfs mailing list archive is really frightening. If these
> symptoms commonly re-appear with no fix in sight, I'm afraid I'll have
> to either

Re: [PATCH] exportfs: be careful to only return expected errors.

2016-08-04 Thread J. Bruce Fields
On Thu, Aug 04, 2016 at 05:47:19AM -0700, Christoph Hellwig wrote:
> On Thu, Aug 04, 2016 at 10:19:06AM +1000, NeilBrown wrote:
>>
>> When nfsd calls fh_to_dentry, it expects ESTALE or ENOMEM as errors.
>> In particular it can be tempting to return ENOENT, but this is not
>> handled well

Re: How to stress test raid6 on 122 disk array

2016-08-04 Thread Chris Murphy
On Thu, Aug 4, 2016 at 1:05 PM, Austin S. Hemmelgarn wrote:
> Fedora should be fine (they're good about staying up to
> date), but if possible you should probably use Rawhide instead of a regular
> release, as that will give you quite possibly one of the closest

[GIT PULL] Btrfs

2016-08-04 Thread Chris Mason
Hi Linus, This is part two of my btrfs pull, which is some cleanups and a batch of fixes. Most of the code here is from Jeff Mahoney, making the pointers we pass around internally more consistent and less confusing overall. I noticed a small problem right before I sent this out yesterday, so

Re: How to stress test raid6 on 122 disk array

2016-08-04 Thread Austin S. Hemmelgarn
On 2016-08-04 13:43, Martin wrote:
> Hi, I would like to find rare raid6 bugs in btrfs, where I have the
> following hw:
> * 2x 8 core CPU
> * 128GB ram
> * 70 FC disk array (56x 500GB + 14x 1TB SATA disks)
> * 24 FC or 2x SAS disk array (1TB SAS disks)
> * 16 FC disk array (1TB SATA disks)
> * 12 SAS disk

Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem

2016-08-04 Thread Lutz Vieweg
Hi, I was today hit by what I think is probably the same bug: A btrfs on a close-to-4TB sized block device, only half filled to almost exactly 2 TB, suddenly says "no space left on device" upon any attempt to write to it. The filesystem was NOT automatically switched to read-only by the kernel,
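When a half-filled btrfs reports "no space left on device", the usual first step is comparing per-type allocation against raw free space. A sketch of the standard diagnostic commands (the mount point is an assumption; the balance filter value is illustrative, not from this thread):

```shell
#!/bin/sh
# Sketch: diagnose btrfs ENOSPC on a filesystem that looks half empty.
MNT=${MNT:-/mountpoint}   # assumed mount point; adjust to your setup

if command -v btrfs >/dev/null 2>&1 && [ -d "$MNT" ]; then
    btrfs filesystem df "$MNT"      # per-type (Data/Metadata/System) usage
    btrfs filesystem usage "$MNT"   # device-level allocated vs. unallocated
    # If all raw space is allocated to chunks but the chunks are mostly
    # empty, a filtered balance often reclaims it (commented out; it can
    # run for a long time):
    # btrfs balance start -dusage=5 "$MNT"
    STATUS=ran
else
    echo "btrfs tools or $MNT not available; skipping"
    STATUS=skipped
fi
```

The telling symptom is `unallocated` near zero in `filesystem usage` while `Data, used` is far below `Data, total`.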

Re: [PATCH 37/45] drivers: use req op accessor

2016-08-04 Thread Shaun Tancheff
On Thu, Aug 4, 2016 at 10:46 AM, Christoph Hellwig wrote:
> On Wed, Aug 03, 2016 at 07:30:29PM -0500, Shaun Tancheff wrote:
>> I think the translation in loop.c is suspicious here:
>>
>> "if use DIO && not (a flush_flag or discard_flag)"
>> should translate to:
>> "if use DIO && not ((a flush_flag) || op == discard)"
>>
>> But in the patch I read:
>> "if

Re: [PATCH 37/45] drivers: use req op accessor

2016-08-04 Thread Christoph Hellwig
On Wed, Aug 03, 2016 at 07:30:29PM -0500, Shaun Tancheff wrote:
> I think the translation in loop.c is suspicious here:
>
> "if use DIO && not (a flush_flag or discard_flag)"
> should translate to:
> "if use DIO && not ((a flush_flag) || op == discard)"
>
> But in the patch I read:

Re: [4.8] btrfs heats my room with lock contention

2016-08-04 Thread Chris Mason
On 08/04/2016 02:41 AM, Dave Chinner wrote:

Simple test. 8GB pmem device on a 16p machine:

# mkfs.btrfs /dev/pmem1
# mount /dev/pmem1 /mnt/scratch
# dbench -t 60 -D /mnt/scratch 16

And heat your room with the warm air rising from your CPUs. Top half of the btrfs profile looks like:

Re: [PATCH] exportfs: be careful to only return expected errors.

2016-08-04 Thread Christoph Hellwig
On Thu, Aug 04, 2016 at 10:19:06AM +1000, NeilBrown wrote:
>
> When nfsd calls fh_to_dentry, it expects ESTALE or ENOMEM as errors.
> In particular it can be tempting to return ENOENT, but this is not
> handled well by nfsd.
>
> Rather than requiring strict adherence to error code code

Re: memory overflow or undeflow in free space tree / space_info?

2016-08-04 Thread Stefan Priebe - Profihost AG
On 29.07.2016 at 23:03, Josef Bacik wrote:
> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>>> Dear list, i'm seeing btrfs no space

Re: Extents for a particular subvolume

2016-08-04 Thread Austin S. Hemmelgarn
On 2016-08-03 17:55, Graham Cobb wrote:
> On 03/08/16 21:37, Adam Borowski wrote:
>> On Wed, Aug 03, 2016 at 08:56:01PM +0100, Graham Cobb wrote:
>>> Are there any btrfs commands (or APIs) to allow a script to create a
>>> list of all the extents referred to within a particular (mounted)
>>> subvolume? And is
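The preview cuts off before any answer, but one generic approximation is walking the subvolume with filefrag (from e2fsprogs), which prints each file's extent ranges via the FIEMAP ioctl. A sketch, with the subvolume path as an assumption; note that reflinked data appears under every file that references it, so the raw listing over-counts shared extents:

```shell
#!/bin/sh
# Sketch: list extents of every regular file in a subvolume via FIEMAP.
SUBVOL=${SUBVOL:-/mnt/subvol}   # assumed subvolume path; adjust to your setup

if [ -d "$SUBVOL" ] && command -v filefrag >/dev/null 2>&1; then
    # -x: stay on this filesystem; -v: print logical/physical extent ranges
    find "$SUBVOL" -xdev -type f -exec filefrag -v {} +
    STATUS=ran
else
    echo "path or filefrag missing; skipping"
    STATUS=skipped
fi
```

filefrag's flags column marks extents the kernel reports as shared, which is one rough way to approach the "is this extent shared" half of the question.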

[4.8] btrfs heats my room with lock contention

2016-08-04 Thread Dave Chinner
Simple test. 8GB pmem device on a 16p machine:

# mkfs.btrfs /dev/pmem1
# mount /dev/pmem1 /mnt/scratch
# dbench -t 60 -D /mnt/scratch 16

And heat your room with the warm air rising from your CPUs. Top half of the btrfs profile looks like:

36.71% [kernel] [k] _raw_spin_unlock_irqrestore
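The three commands above are self-contained enough to script. A guarded sketch: the device and mount point come from the original message, the run is destructive to DEV (mkfs -f), and it skips cleanly when the device or dbench is absent:

```shell
#!/bin/sh
# Sketch: reproduce the dbench lock-contention test from the report.
DEV=${DEV:-/dev/pmem1}      # pmem device from the original message
MNT=${MNT:-/mnt/scratch}

if [ "$(id -u)" = "0" ] && [ -b "$DEV" ] \
   && command -v dbench >/dev/null 2>&1; then
    mkfs.btrfs -f "$DEV"        # -f: overwrite any existing fs (destructive)
    mkdir -p "$MNT"
    mount "$DEV" "$MNT"
    dbench -t 60 -D "$MNT" 16   # 16 clients for 60 seconds
    umount "$MNT"
    STATUS=ran
else
    echo "$DEV or dbench not available (or not root); skipping"
    STATUS=skipped
fi
```

Running `perf top` (or `perf record -a` plus `perf report`) alongside the dbench run is the usual way to get the profile shown in the report.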