[PATCH] btrfs: qgroup: Fix inconsistent IS_ERR and PTR_ERR

2019-01-30 Thread Gustavo A. R. Silva
Fix inconsistent IS_ERR and PTR_ERR in btrfs_qgroup_trace_subtree_after_cow The proper pointer to be passed as argument is reloc_eb. This bug was detected with the help of Coccinelle. Fixes: 2b35a512e9cf ("btrfs: qgroup: Use delayed subtree rescan for balance") Signed-off-by: Gustavo A. R. Silva
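The class of bug described above can be illustrated with a small userspace sketch. The helpers below are simplified stand-ins for the kernel's error-pointer macros, and `read_block()` and `trace_subtree()` are hypothetical names used only for illustration; this is not the code from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical reader: returns a valid buffer or an encoded errno. */
static void *read_block(int fail)
{
	static int block;

	return fail ? ERR_PTR(-EIO) : (void *)&block;
}

/*
 * Consistent form: the pointer tested with IS_ERR() is the same pointer
 * decoded with PTR_ERR(). The inconsistent form decodes a different
 * pointer and returns an unrelated value.
 */
static long trace_subtree(int fail)
{
	void *reloc_eb = read_block(fail);

	if (IS_ERR(reloc_eb))
		return PTR_ERR(reloc_eb);
	return 0;
}
```

Calling `trace_subtree(1)` returns `-EIO`, because the tested and decoded pointers match.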

Re: [PATCH RFC 2/2] btrfs: Introduce free dev extent hint to speed up chunk allocation

2019-01-30 Thread Qu Wenruo
> [ENHANCEMENT] > This patch will introduce btrfs_device::hint_free_dev_extent member to > give some hint for chunk allocator to find free dev extents. > > The hint itself is pretty simple, only tells where the first free slot > could possibly be. > > It is not 100% correct, unlike free space cac

Re: [PATCH v4 04/12] btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up

2019-01-30 Thread Qu Wenruo
On 2019/1/30 11:19 PM, David Sterba wrote: > On Fri, Jan 25, 2019 at 01:09:17PM +0800, Qu Wenruo wrote: >> +static int __must_check flush_write_bio(struct extent_page_data *epd) >> { >> -if (epd->bio) { >> -int ret; >> +int ret = 0; >> >> +if (epd->bio) { >>

Re: [PATCH v4 03/12] btrfs: disk-io: Show the timing of corrupted tree block explicitly

2019-01-30 Thread Qu Wenruo
On 2019/1/30 10:59 PM, Nikolay Borisov wrote: > > > On 30.01.19 at 16:57, David Sterba wrote: >> On Fri, Jan 25, 2019 at 01:09:16PM +0800, Qu Wenruo wrote: >>> Just add one extra line to show when the corruption is detected. >>> Currently only read time detection is possible. >>> >>> Signed-

Re: [PATCH v4 1/3] btrfs: scrub: fix circular locking dependency warning

2019-01-30 Thread Anand Jain
On 1/30/19 10:07 PM, David Sterba wrote: On Wed, Jan 30, 2019 at 02:45:00PM +0800, Anand Jain wrote: v3->v4: Fix list corruption as reported by btrfs/073 by David. [1] https://patchwork.kernel.org/patch/10705741/ Which I was able to reproduce with an instrument

Re: [PATCH 09/11] btrfs: change set_level() to bound the level passed in

2019-01-30 Thread Dennis Zhou
On Tue, Jan 29, 2019 at 10:14:18AM +0200, Nikolay Borisov wrote: > > > On 28.01.19 at 23:24, Dennis Zhou wrote: > > Currently, the only user of set_level() is zlib which sets an internal > > workspace parameter. As level is now plumbed into get_workspace(), this > > can be handled there rather

Re: [PATCH 00/11] btrfs: add zstd compression level support

2019-01-30 Thread Dennis Zhou
Hi David, On Tue, Jan 29, 2019 at 06:18:30PM +0100, David Sterba wrote: > On Mon, Jan 28, 2019 at 04:24:26PM -0500, Dennis Zhou wrote: > > As mentioned above, a requirement that differentiates zstd from zlib is that > > higher levels of compression require more memory. To manage this, each > > compressio

Re: [PATCH] Btrfs: fix deadlock when allocating tree block during leaf/node split

2019-01-30 Thread David Sterba
On Fri, Jan 25, 2019 at 11:48:51AM +, fdman...@kernel.org wrote: > From: Filipe Manana > > When splitting a leaf or node from one of the trees that are modified when > flushing pending block groups (extent, chunk, device and free space trees), > we need to allocate a new tree block, which in

Re: [PATCH] btrfs: Output ENOSPC debug info in inc_block_group_ro()

2019-01-30 Thread David Sterba
On Wed, Jan 30, 2019 at 01:07:51PM +0800, Qu Wenruo wrote: > Since inc_block_group_ro() would return -ENOSPC, outputting debug info > for enospc_debug mount option would be helpful to debug some balance > false ENOSPC report. Sure, added to misc-next.

Re: [PATCH] fs/btrfs: On error always free subvol_name in btrfs_mount

2019-01-30 Thread David Sterba
On Wed, Jan 30, 2019 at 07:54:12AM -0600, Eric W. Biederman wrote: > > The subvol_name is allocated in btrfs_parse_subvol_options and is > consumed and freed in mount_subvol. Add a free to the error paths that > don't call mount_subvol so that it is guaranteed that subvol_name is > freed when an

Re: [PATCH] btrfs: qgroup: Remove duplicated trace points for qgroup_rsv_add/release()

2019-01-30 Thread David Sterba
On Tue, Nov 13, 2018 at 03:05:08PM +0800, Qu Wenruo wrote: > Inside qgroup_rsv_add/release(), we have trace events > trace_qgroup_update_reserve() to catch reserved space update. > > However we still have two manual trace_qgroup_update_reserve() calls > just outside these functions. > > Remove th

Re: dm-integrity + mdadm + btrfs = no journal?

2019-01-30 Thread Hans van Kranenburg
On 1/30/19 5:38 PM, Hans van Kranenburg wrote: > On 1/30/19 4:26 PM, Christoph Anton Mitterer wrote: >> On Wed, 2019-01-30 at 07:58 -0500, Austin S. Hemmelgarn wrote: >>> Running dm-integrity without a journal is roughly equivalent to >>> using >>> the nobarrier mount option (the journal is used t

Re: [PATCH 6/8] btrfs: loop in inode_rsv_refill

2019-01-30 Thread David Sterba
On Mon, Dec 03, 2018 at 10:24:57AM -0500, Josef Bacik wrote: > With severe fragmentation we can end up with our inode rsv size being > huge during writeout, which would cause us to need to make very large > metadata reservations. However we may not actually need that much once > writeout is comple

Re: dm-integrity + mdadm + btrfs = no journal?

2019-01-30 Thread Hans van Kranenburg
On 1/30/19 4:26 PM, Christoph Anton Mitterer wrote: > On Wed, 2019-01-30 at 07:58 -0500, Austin S. Hemmelgarn wrote: >> Running dm-integrity without a journal is roughly equivalent to >> using >> the nobarrier mount option (the journal is used to provide the same >> guarantees that barriers do).

Re: dm-integrity + mdadm + btrfs = no journal?

2019-01-30 Thread Christoph Anton Mitterer
On Wed, 2019-01-30 at 11:00 -0500, Austin S. Hemmelgarn wrote: > Running dm-integrity on a device which doesn't support barriers > without > a journal is risky, because the journal can help mitigate the issues > arising from the lack of barrier support. Does it? Isn't it then suffering from the

Re: dm-integrity + mdadm + btrfs = no journal?

2019-01-30 Thread Austin S. Hemmelgarn
On 2019-01-30 10:26, Christoph Anton Mitterer wrote: On Wed, 2019-01-30 at 07:58 -0500, Austin S. Hemmelgarn wrote: Running dm-integrity without a journal is roughly equivalent to using the nobarrier mount option (the journal is used to provide the same guarantees that barriers do). IOW, don't

Re: dm-integrity + mdadm + btrfs = no journal?

2019-01-30 Thread Christoph Anton Mitterer
On Wed, 2019-01-30 at 07:58 -0500, Austin S. Hemmelgarn wrote: > Running dm-integrity without a journal is roughly equivalent to > using > the nobarrier mount option (the journal is used to provide the same > guarantees that barriers do). IOW, don't do this unless you are > willing > to lose th

Re: [PATCH v4 04/12] btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up

2019-01-30 Thread David Sterba
On Fri, Jan 25, 2019 at 01:09:17PM +0800, Qu Wenruo wrote: > +static int __must_check flush_write_bio(struct extent_page_data *epd) > { > - if (epd->bio) { > - int ret; > + int ret = 0; > > + if (epd->bio) { > ret = submit_one_bio(epd->bio, 0, 0); > -

Re: [PATCH v4 04/12] btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up

2019-01-30 Thread David Sterba
On Fri, Jan 25, 2019 at 01:09:17PM +0800, Qu Wenruo wrote: > We have a BUG_ON() in flush_write_bio() to handle the return value of > submit_one_bio(). > > Move the BUG_ON() one level up to all its callers. > > No functional change, just to make later BUG_ON() cleanup more obvious. > > Signed-off
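The idea of returning the submission result and leaving the check to each caller can be sketched in userspace as follows. The `must_check` macro assumes GCC or Clang, and `struct page_data`, `submit_one()`, `flush_write()` and `write_pages()` are hypothetical stand-ins, not the btrfs functions themselves.

```c
#include <stddef.h>

/* Assumed userspace equivalent of the kernel's __must_check (GCC/Clang). */
#define must_check __attribute__((warn_unused_result))

/* Stand-in for struct extent_page_data: holds at most one pending bio. */
struct page_data {
	int *bio;
};

/* Hypothetical submit: returns 0 or a negative errno stored in the bio. */
static int submit_one(int *bio)
{
	return *bio;
}

/*
 * The helper no longer asserts on failure. It reports the result and the
 * compiler warns if a caller ignores it.
 */
static must_check int flush_write(struct page_data *epd)
{
	int ret = 0;

	if (epd->bio) {
		ret = submit_one(epd->bio);
		epd->bio = NULL;	/* assumption: the bio is consumed either way */
	}
	return ret;
}

/* A caller now decides what to do with the error, here by propagating it. */
static int write_pages(struct page_data *epd)
{
	int ret = flush_write(epd);

	if (ret < 0)
		return ret;	/* the kernel patch places a BUG_ON() at this level */
	return 0;
}
```

Keeping the check in each caller makes every remaining assertion visible at its call site, which is the stated motivation for the later cleanup.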

Re: [PATCH v4 03/12] btrfs: disk-io: Show the timing of corrupted tree block explicitly

2019-01-30 Thread Nikolay Borisov
On 30.01.19 at 16:57, David Sterba wrote: > On Fri, Jan 25, 2019 at 01:09:16PM +0800, Qu Wenruo wrote: >> Just add one extra line to show when the corruption is detected. >> Currently only read time detection is possible. >> >> Signed-off-by: Qu Wenruo >> Reviewed-by: Nikolay Borisov >> ---

Re: [PATCH v4 03/12] btrfs: disk-io: Show the timing of corrupted tree block explicitly

2019-01-30 Thread David Sterba
On Fri, Jan 25, 2019 at 01:09:16PM +0800, Qu Wenruo wrote: > Just add one extra line to show when the corruption is detected. > Currently only read time detection is possible. > > Signed-off-by: Qu Wenruo > Reviewed-by: Nikolay Borisov > --- > fs/btrfs/disk-io.c | 2 ++ > 1 file changed, 2 inse

[PATCH 00/15] Improvements to fitrim

2019-01-30 Thread Nikolay Borisov
Here's a series that spruces up btrfs' trim implementation. The main goal is to optimise trim of freespace so that when a range is trimmed once and not allocated, subsequent trims will skip it, thus improving performance. First 3 patches are misc cleanups which are mostly independent. Of them,

[PATCH 04/15] btrfs: combine device update operations during transaction commit

2019-01-30 Thread Nikolay Borisov
From: Jeff Mahoney We currently overload the pending_chunks list to handle updating btrfs_device->commit_bytes used. We don't actually care about the extent mapping or even the device mapping for the chunk - we just need the device, and we can end up processing it multiple times. The fs_devices

[PATCH 07/15] btrfs: Populate ->orig_block_len during read_one_chunk

2019-01-30 Thread Nikolay Borisov
Chunks read from disk currently don't get their ->orig_block_len member set. In contrast, when a new chunk is allocated, the respective extent_map's ->orig_block_len is assigned the size of the stripe of this chunk. Let's apply the same strategy for chunks which are read from disk, not only does thi

[PATCH 05/15] btrfs: Handle pending/pinned chunks before blockgroup relocation during device shrink

2019-01-30 Thread Nikolay Borisov
During device shrink, pinned/pending chunks (i.e. those which have been deleted/created respectively in the current transaction and haven't touched disk) need to be accounted for. Presently this happens after the main relocation loop in btrfs_shrink_device, which could lead to m

[PATCH 06/15] btrfs: Rename and export clear_btree_io_tree

2019-01-30 Thread Nikolay Borisov
This function is going to be used to clear out the device extent allocation information. Give it a more generic name and export it. This is in preparation to replacing the pending/pinned chunk lists with an extent tree. No functional changes. Signed-off-by: Nikolay Borisov --- fs/btrfs/extent_io

[PATCH 09/15] btrfs: replace pending/pinned chunks lists with io tree

2019-01-30 Thread Nikolay Borisov
From: Jeff Mahoney The pending chunks list contains chunks that are allocated in the current transaction but haven't been created yet. The pinned chunks list contains chunks that are being released in the current transaction. Both describe chunks that are not reflected on disk as in use but are u

[PATCH 12/15] btrfs: Optimize unallocated chunks discard

2019-01-30 Thread Nikolay Borisov
Currently unallocated chunks are always trimmed. For example 2 consecutive trims on large storage would trim freespace twice irrespective of whether the space was actually allocated or not between those trims. Optimise this behavior by exploiting the newly introduced alloc_state tree of btrfs_devi

[PATCH 01/15] btrfs: Honour FITRIM range constraints during free space trim

2019-01-30 Thread Nikolay Borisov
Up until now trimming the freespace was done irrespective of what the arguments of the FITRIM ioctl were. For example fstrim's -o/-l arguments will be entirely ignored. Fix it by correctly handling those parameters. This requires breaking if the found freespace extent is after the end of the passed
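A minimal sketch of honouring a requested trim range is shown below. It assumes the free extents are sorted by start offset and ignores integer overflow; `struct extent` and `trim_in_range()` are hypothetical names and do not reflect the btrfs free-space data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free-space extent: [start, start + len). */
struct extent {
	uint64_t start;
	uint64_t len;
};

/*
 * Return how many bytes fall inside [range_start, range_start + range_len).
 * Extents must be sorted by start, so the loop can stop at the first
 * extent that begins at or after the end of the requested range.
 */
static uint64_t trim_in_range(const struct extent *extents, size_t n,
			      uint64_t range_start, uint64_t range_len)
{
	uint64_t range_end = range_start + range_len;
	uint64_t trimmed = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t s = extents[i].start;
		uint64_t e = s + extents[i].len;

		if (s >= range_end)
			break;		/* past the requested range: stop */
		if (e <= range_start)
			continue;	/* entirely before the range: skip */
		if (s < range_start)
			s = range_start;	/* clamp the front */
		if (e > range_end)
			e = range_end;		/* clamp the back */
		trimmed += e - s;
	}
	return trimmed;
}
```

With free extents at offsets 0, 20 and 40 (10 bytes each) and a requested range of offset 5, length 20, only the tail of the first extent and the head of the second are counted.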

[PATCH 10/15] btrfs: Remove 'trans' argument from find_free_dev_extent(_start)

2019-01-30 Thread Nikolay Borisov
Now that those functions no longer require a handle to the transaction to inspect pending/pinned chunks the argument can be removed. At the same time also remove any surrounding code which acquired the handle. Signed-off-by: Nikolay Borisov --- fs/btrfs/extent-tree.c | 36 +++-

[PATCH 08/15] btrfs: Introduce new bits for device allocation tree

2019-01-30 Thread Nikolay Borisov
Rather than hijacking the existing defines let's just define new bits, with more descriptive names. Instead of using yet more (currently at 18) bits for the new flags, use the fact those flags will be specific to the device allocation tree so define them using existing EXTENT_* flags. Signed-off-b

[PATCH 03/15] btrfs: Remove EXTENT_FIRST_DELALLOC bit

2019-01-30 Thread Nikolay Borisov
With the refactoring introduced in 8b62f87bad9c ("Btrfs: rework outstanding_extents") this flag became unused. Remove it and renumber the following flags accordingly. No functional changes. Signed-off-by: Nikolay Borisov --- fs/btrfs/extent_io.c | 2 -- fs/btrfs/extent_io.h | 15 +++---

[PATCH 14/15] btrfs: Implement find_first_clear_extent_bit

2019-01-30 Thread Nikolay Borisov
This function is very similar to find_first_extent_bit except that it locates the first contiguous span of space which does not have bits set. Its intended use is in the freespace trimming code. Signed-off-by: Nikolay Borisov --- fs/btrfs/extent_io.c | 78 +++
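The notion of "first contiguous span without bits set" can be sketched over a plain sorted array of set ranges. This is an illustration only: `struct range` and `first_clear_range()` are hypothetical, the kernel function operates on an extent io tree, and its exact return semantics may differ from what is shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "set" range with inclusive bounds. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Find the first clear span at or after 'from', given set ranges that are
 * sorted and non-overlapping. Returns 0 and fills start_ret/end_ret
 * (inclusive), or -1 if everything from 'from' onward is set.
 */
static int first_clear_range(const struct range *set, size_t n, uint64_t from,
			     uint64_t *start_ret, uint64_t *end_ret)
{
	uint64_t cur = from;

	for (size_t i = 0; i < n; i++) {
		if (set[i].end < cur)
			continue;		/* range lies before cur */
		if (set[i].start > cur) {
			*start_ret = cur;	/* gap before this range */
			*end_ret = set[i].start - 1;
			return 0;
		}
		if (set[i].end == UINT64_MAX)
			return -1;		/* set up to the very end */
		cur = set[i].end + 1;		/* cur was inside a set range */
	}
	*start_ret = cur;			/* clear until the end of the space */
	*end_ret = UINT64_MAX;
	return 0;
}
```

A trim pass could call such a helper repeatedly, advancing `from` past each returned span, so that only ranges not already marked are visited.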

[PATCH 11/15] btrfs: Factor out in_range macro

2019-01-30 Thread Nikolay Borisov
This is used in more than one place so let's factor it out in ctree.h. No functional changes. Signed-off-by: Nikolay Borisov --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/extent-tree.c | 1 - fs/btrfs/volumes.c | 1 - 3 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ctr
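For readers unfamiliar with the helper, a range-membership macro of this kind typically looks like the sketch below. The definition is an assumption, since the diff body is cut off in this listing; `block_in_chunk()` is a hypothetical caller added for illustration.

```c
#include <stdint.h>

/*
 * Assumed shape of the macro: true when b lies in the half-open interval
 * [first, first + len). Arguments are evaluated more than once, so they
 * should be free of side effects.
 */
#define in_range(b, first, len) ((b) >= (first) && (b) < (first) + (len))

/* Hypothetical caller: does a byte offset fall inside a chunk? */
static int block_in_chunk(uint64_t bytenr, uint64_t chunk_start,
			  uint64_t chunk_len)
{
	return in_range(bytenr, chunk_start, chunk_len);
}
```

Defining the macro once in a shared header avoids the per-file copies that the diffstat shows being removed.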

[PATCH 13/15] btrfs: Fix gross misnaming

2019-01-30 Thread Nikolay Borisov
The variables and function parameters of __etree_search which pertain to prev/next are grossly misnamed. Namely, prev_ret holds the next state and not the previous. Similarly, next_ret actually holds the previous extent state relating to the offset we are interested in. Fix this by renaming the var

[PATCH 02/15] btrfs: Make WARN_ON in a canonical form

2019-01-30 Thread Nikolay Borisov
There is no point in using a construct like 'if (!condition) WARN_ON(1)'. Use WARN_ON(!condition) directly. No functional changes. Signed-off-by: Nikolay Borisov --- fs/btrfs/extent-tree.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/bt
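The two forms can be compared in a userspace sketch. `WARN_ON` here is a simplified stand-in that counts and prints warnings, and `check_before()` and `check_after()` are hypothetical functions; the kernel macro additionally dumps a stack trace.

```c
#include <stdio.h>

static int warn_count;

/* Simplified stand-in: report the failed expression and return the condition. */
static int warn_on_impl(int cond, const char *expr, const char *file, int line)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}

#define WARN_ON(cond) warn_on_impl(!!(cond), #cond, __FILE__, __LINE__)

/* Non-canonical form: the message only ever says "1". */
static void check_before(int ok)
{
	if (!ok)
		WARN_ON(1);
}

/* Canonical form: same behavior, and the message names the condition. */
static void check_after(int ok)
{
	WARN_ON(!ok);
}
```

Both functions warn in exactly the same cases. In this sketch the canonical form also reports the failed expression (`!ok`) rather than the literal `1`.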

[PATCH 15/15] btrfs: Switch btrfs_trim_free_extents to find_first_clear_extent_bit

2019-01-30 Thread Nikolay Borisov
Instead of always calling the allocator to search for a free extent that satisfies the input criteria, switch btrfs_trim_free_extents to using find_first_clear_extent_bit. With this change it's no longer necessary to read the device tree in order to figure out holes in the devices. Now the code a

Re: [PATCH] fs/btrfs: On error always free subvol_name in btrfs_mount

2019-01-30 Thread Nikolay Borisov
On 30.01.19 at 15:54, Eric W. Biederman wrote: > > The subvol_name is allocated in btrfs_parse_subvol_options and is > consumed and freed in mount_subvol. Add a free to the error paths that > don't call mount_subvol so that it is guaranteed that subvol_name is > freed when an error happens.

Re: [PATCH v4 1/3] btrfs: scrub: fix circular locking dependency warning

2019-01-30 Thread David Sterba
On Wed, Jan 30, 2019 at 02:45:00PM +0800, Anand Jain wrote: > v3->v4: Fix list corruption as reported by btrfs/073 by David. >[1] >https://patchwork.kernel.org/patch/10705741/ > Which I was able to reproduce with an instrumented kernel but not with > btrfs/073. >

[PATCH] fs/btrfs: On error always free subvol_name in btrfs_mount

2019-01-30 Thread Eric W. Biederman
The subvol_name is allocated in btrfs_parse_subvol_options and is consumed and freed in mount_subvol. Add a free to the error paths that don't call mount_subvol so that it is guaranteed that subvol_name is freed when an error happens. Fixes: 312c89fbca06 ("btrfs: cleanup btrfs_mount() using btr
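The ownership rule described above can be sketched in userspace. All names below (`release()`, `consume_name()`, `do_mount()`) are hypothetical and only mirror the pattern: a string is allocated early, one callee consumes and frees it, and every error path that never reaches that callee must free it itself.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

static int frees;	/* counts releases so the behavior can be observed */

static void release(char *p)
{
	if (p)
		frees++;
	free(p);
}

/* Plays the role of the consumer: takes ownership and frees the name. */
static int consume_name(char *name)
{
	release(name);
	return 0;
}

static int do_mount(const char *opt, int fail_early)
{
	char *name = malloc(strlen(opt) + 1);

	if (!name)
		return -ENOMEM;
	strcpy(name, opt);

	if (fail_early) {
		/* Error path that never reaches the consumer: free here. */
		release(name);
		return -EINVAL;
	}
	return consume_name(name);
}
```

Whichever path is taken, the name is released exactly once, which is the guarantee the patch description asks for.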

Re: dm-integrity + mdadm + btrfs = no journal?

2019-01-30 Thread Austin S. Hemmelgarn
On 2019-01-29 18:15, Hans van Kranenburg wrote: Hi, Thought experiment time... I have an HP z820 workstation here (with ECC memory, yay!) and 4x250G 10k SAS disks (and some spare disks). It's donated hardware, and I'm going to use it to replace the current server in the office of a non-profit o

Re: [PATCH 1/2] btrfs: Don't search devid for every verify_one_dev_extent() call

2019-01-30 Thread Nikolay Borisov
On 30.01.19 at 9:39, Qu Wenruo wrote: > verify_one_dev_extent() will call btrfs_find_device() for each dev > extent, this wastes some CPU time just searching the devices list. > > Move the search one level up, into the btrfs_verify_dev_extents(), so > for each device we only call btrfs_find_d
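The optimisation of looking a device up once per device rather than once per extent can be sketched as follows. It assumes the extents arrive grouped by device id; `struct device`, `struct dev_extent`, `find_device()` and `verify_extents()` are hypothetical names, not the btrfs ones.

```c
#include <stddef.h>
#include <stdint.h>

struct device {
	uint64_t devid;
};

struct dev_extent {
	uint64_t devid;
	uint64_t offset;
};

static unsigned int lookups;	/* counts list searches for illustration */

/* Linear search over the device list, the cost being avoided. */
static const struct device *find_device(const struct device *devs, size_t n,
					uint64_t devid)
{
	lookups++;
	for (size_t i = 0; i < n; i++)
		if (devs[i].devid == devid)
			return &devs[i];
	return NULL;
}

/*
 * The lookup is hoisted out of the per-extent check: the cached device is
 * reused until an extent with a different devid is seen.
 */
static int verify_extents(const struct device *devs, size_t ndev,
			  const struct dev_extent *ext, size_t next)
{
	const struct device *cur = NULL;

	for (size_t i = 0; i < next; i++) {
		if (!cur || cur->devid != ext[i].devid) {
			cur = find_device(devs, ndev, ext[i].devid);
			if (!cur)
				return -1;	/* extent refers to an unknown device */
		}
		/* per-extent verification against 'cur' would go here */
	}
	return 0;
}
```

For five extents spread over two devices, the device list is searched twice instead of five times.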

Re: dm-integrity + mdadm + btrfs = no journal?

2019-01-30 Thread Roman Mamedov
On Tue, 29 Jan 2019 23:15:18 + Hans van Kranenburg wrote: > So, what I was thinking of is: > > * Use dm-integrity on partitions on the individual disks > * Use mdadm RAID10 on top (which is then able to repair bitrot) > * Use LVM on top > * Etc... You never explicitly say what's the whole i