Re: [PATCH 04/10] Btrfs: extent map selftest: buffered write vs dio read

2017-12-21 Thread Nikolay Borisov
On 22.12.2017 00:42, Liu Bo wrote: > This test case simulates the racy situation of buffered write vs dio > read, and see if btrfs_get_extent() would return -EEXIST. Isn't mixing dio/buffered IO on the same file (range?) considered dangerous in any case? > > Signed-off-by: Liu Bo

Re: [PATCH 03/10] Btrfs: add extent map selftests

2017-12-21 Thread Nikolay Borisov
On 22.12.2017 00:42, Liu Bo wrote: > We've observed that btrfs_get_extent() and merge_extent_mapping() could > return -EEXIST in several cases, and they are caused by some racy > condition, e.g. dio read vs dio write, which makes the problem very tricky > to reproduce. > > This adds extent map

Re: [PATCH 01/10] Btrfs: add helper for em merge logic

2017-12-21 Thread Nikolay Borisov
On 22.12.2017 00:42, Liu Bo wrote: > This is preparatory work for the following extent map selftest, which > runs tests against em merge logic. > > Signed-off-by: Liu Bo > --- > fs/btrfs/ctree.h | 2 ++ > fs/btrfs/inode.c | 101 >

[PATCH v2 01/10] btrfs: qgroup: Split meta rsv type into meta_prealloc and meta_pertrans

2017-12-21 Thread Qu Wenruo
Btrfs uses 2 different methods to reserve metadata qgroup space. 1) Reserve at btrfs_start_transaction() time This is quite straightforward, caller will use the trans handler allocated to modify b-trees. In this case, reserved metadata should be kept until qgroup numbers are updated.

[PATCH v2 07/10] btrfs: qgroup: Update trace events for metadata reservation

2017-12-21 Thread Qu Wenruo
Now trace_qgroup_meta_reserve() will have extra type parameter. And introduce two new trace events: 1) trace_qgroup_meta_free_all_pertrans() For btrfs_qgroup_free_meta_all_pertrans() 2) trace_qgroup_meta_convert() For btrfs_qgroup_convert_reserved_meta() Signed-off-by: Qu Wenruo

[PATCH v2 10/10] btrfs: qgroup: Use independent and accurate per inode qgroup rsv

2017-12-21 Thread Qu Wenruo
Unlike the reservation calculation used in inode rsv for metadata, qgroup doesn't really need to care about things like csum size or extent usage for whole tree COW. Qgroup cares more about the net change of extent usage. That's to say, if we're going to insert one file extent, it will mostly find its place in

[PATCH v2 08/10] Revert "btrfs: qgroups: Retry after commit on getting EDQUOT"

2017-12-21 Thread Qu Wenruo
This reverts commit 48a89bc4f2ceab87bc858a8eb189636b09c846a7. The idea to commit transaction and free some space after hitting qgroup limit is good, although the problem is it will easily cause deadlocks. One deadlock example is caused by trying to flush data while still holding it: Call Trace:

[PATCH v2 05/10] btrfs: delayed-inode: Use new qgroup meta rsv for delayed inode and item

2017-12-21 Thread Qu Wenruo
Quite similar for delalloc, some modification to delayed-inode and delayed-item reservation. Also needs extra parameter for release case to distinguish normal release and error release. Signed-off-by: Qu Wenruo --- fs/btrfs/delayed-inode.c | 56

[PATCH v2 00/10] Use split qgroup rsv type

2017-12-21 Thread Qu Wenruo
[Overall] The previous rework of the qgroup reservation system put a lot of effort into data, which works quite well. But it paid less attention to metadata reservation, causing problems like metadata reservation underflow and noisy kernel warnings. This patchset will try to address the remaining

[PATCH v2 04/10] btrfs: qgroup: Use separate meta reservation type for delalloc

2017-12-21 Thread Qu Wenruo
Before this patch, btrfs qgroup was mixing per-transaction meta rsv with preallocated meta rsv, making it quite easy to underflow qgroup meta reservation. Since we have the new qgroup meta rsv types, apply them to delalloc reservation. Now for delalloc, most of its reserved space will use

[PATCH v2 02/10] btrfs: qgroup: Don't use root->qgroup_meta_rsv for qgroup

2017-12-21 Thread Qu Wenruo
Since qgroup has separate metadata reservation types now, we can completely get rid of the old root->qgroup_meta_rsv, which mostly acts as the current META_PERTRANS reservation type. Signed-off-by: Qu Wenruo --- fs/btrfs/ctree.h | 3 --- fs/btrfs/disk-io.c | 1 -

[PATCH v2 09/10] btrfs: qgroup: Commit transaction in advance to reduce early EDQUOT

2017-12-21 Thread Qu Wenruo
Unlike the previous method, which tried to commit the transaction inside qgroup_reserve(), this time we will try to commit the transaction using fs_info->transaction_kthread, which avoids nested transactions and removes the need to worry about lock context. Since it's an asynchronous function call and we won't wait transaction

[PATCH v2 06/10] btrfs: qgroup: Use root->qgroup_meta_rsv_* to record qgroup meta reserved space

2017-12-21 Thread Qu Wenruo
For the quota disabled->enabled case, it's possible that at reservation time quota was not enabled so no bytes were really reserved, while at release time quota is enabled, so we will try to release some bytes we didn't really own. Such a situation can cause metadata reservation underflow, for both
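
The underflow described in this preview can be illustrated with a minimal sketch. It is not the patch's code: `struct root_rsv`, `rsv_add()` and `rsv_release()` are hypothetical names, and the only idea shown is that a per-root counter records what was really reserved, so a later release can never free more than that.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical per-root counter; not the actual btrfs structure. */
struct root_rsv {
	uint64_t meta_reserved;	/* bytes this root really reserved */
};

/* Record a reservation only if quota was enabled at reservation time. */
void rsv_add(struct root_rsv *root, uint64_t bytes, bool quota_enabled)
{
	if (quota_enabled)
		root->meta_reserved += bytes;
}

/*
 * Release at most what was recorded and return the number of bytes
 * actually released, so the counter cannot underflow.
 */
uint64_t rsv_release(struct root_rsv *root, uint64_t bytes)
{
	uint64_t freed = bytes < root->meta_reserved ? bytes : root->meta_reserved;

	root->meta_reserved -= freed;
	return freed;
}
```

With this shape, a release that follows a reservation made while quota was disabled frees zero bytes instead of subtracting from space the root never owned.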

[PATCH v2 03/10] btrfs: qgroup: Introduce function to convert META_PREALLOC into META_PERTRANS

2017-12-21 Thread Qu Wenruo
For meta_prealloc reservation users, after btrfs_join_transaction() the caller will modify the tree, so part (or even all) of the meta_prealloc reservation should be converted to meta_pertrans until transaction commit time. This patch introduces a new function, btrfs_qgroup_convert_reserved_meta() to do this for
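
As a rough illustration of the conversion this preview describes, the sketch below moves bytes from a preallocated counter to a per-transaction counter. The names `struct meta_rsv`, `meta_rsv_convert()` and `meta_rsv_free_all_pertrans()` are hypothetical and do not reproduce btrfs_qgroup_convert_reserved_meta(). It also assumes that per-transaction bytes are dropped in one go at commit time.

```c
#include <stdint.h>

/* Hypothetical pair of counters, one per reservation type. */
struct meta_rsv {
	uint64_t prealloc;	/* reserved ahead of time, may still be returned */
	uint64_t pertrans;	/* held until the transaction commits */
};

/* Move up to `bytes` from prealloc to pertrans; return the amount moved. */
uint64_t meta_rsv_convert(struct meta_rsv *rsv, uint64_t bytes)
{
	uint64_t moved = bytes < rsv->prealloc ? bytes : rsv->prealloc;

	rsv->prealloc -= moved;
	rsv->pertrans += moved;
	return moved;
}

/* Assumed commit-time step: drop all pertrans bytes, return the amount freed. */
uint64_t meta_rsv_free_all_pertrans(struct meta_rsv *rsv)
{
	uint64_t freed = rsv->pertrans;

	rsv->pertrans = 0;
	return freed;
}
```

Conversion leaves the sum of the two counters unchanged; only the commit-time step reduces it.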

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2017-12-21 Thread jianchao.wang
Sorry for my non-detailed description. On 12/21/2017 09:50 PM, Tejun Heo wrote: > Hello, > > On Thu, Dec 21, 2017 at 11:56:49AM +0800, jianchao.wang wrote: >> It's worrying that even though the blk_mark_rq_complete() here is intended >> to synchronize with >> timeout path, but it indeed give

[PATCH] btrfs: qgroup: remove unused label 'retry'

2017-12-21 Thread Colin King
From: Colin Ian King Label 'retry' is not used, remove it. Cleans up a clang build warning: warning: label ‘retry’ defined but not used [-Wunused-label] Fixes: b283738ab0ad ("Revert "btrfs: qgroups: Retry after commit on getting EDQUOT"") Signed-off-by: Colin Ian

Re: [PATCH] btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes

2017-12-21 Thread Liu Bo
On Sat, Dec 16, 2017 at 08:42:51AM +0200, Nikolay Borisov wrote: > > > On 15.12.2017 21:58, Chris Mason wrote: > > refcounts have a generic implementation and an asm optimized one. The > > generic version has extra debugging to make sure that once a refcount > > goes to zero, refcount_inc won't

[PATCH 02/10] Btrfs: move extent map specific code to extent_map.c

2017-12-21 Thread Liu Bo
These helpers are extent map specific, this moves them to extent_map.c. Signed-off-by: Liu Bo --- fs/btrfs/ctree.h | 2 - fs/btrfs/extent_map.c | 118 ++ fs/btrfs/extent_map.h | 2 + fs/btrfs/inode.c | 117

[PATCH 09/10] Btrfs: add tracepoint for em's EEXIST case

2017-12-21 Thread Liu Bo
This is adding a tracepoint 'btrfs_handle_em_exist' to help debug the subtle bugs around merge_extent_mapping. Signed-off-by: Liu Bo --- fs/btrfs/extent_map.c| 1 + include/trace/events/btrfs.h | 35 +++ 2 files changed, 36

[PATCH 10/10] Btrfs: noinline merge_extent_mapping

2017-12-21 Thread Liu Bo
In order to debug subtle bugs around merge_extent_mapping(), perf probe can be used to check the arguments, but sometimes merge_extent_mapping() got inlined by compiler and couldn't be probed. This is adding noinline attribute to merge_extent_mapping(). Signed-off-by: Liu Bo

[PATCH 04/10] Btrfs: extent map selftest: buffered write vs dio read

2017-12-21 Thread Liu Bo
This test case simulates the racy situation of buffered write vs dio read, and see if btrfs_get_extent() would return -EEXIST. Signed-off-by: Liu Bo --- fs/btrfs/tests/extent-map-tests.c | 73 +++ 1 file changed, 73 insertions(+) diff

[PATCH 08/10] Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping

2017-12-21 Thread Liu Bo
This is a subtle case, so in order to understand the problem, it'd be good to know the content of existing and em when any error occurs. Signed-off-by: Liu Bo --- fs/btrfs/extent_map.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git

[PATCH 03/10] Btrfs: add extent map selftests

2017-12-21 Thread Liu Bo
We've observed that btrfs_get_extent() and merge_extent_mapping() could return -EEXIST in several cases, and they are caused by some racy condition, e.g. dio read vs dio write, which makes the problem very tricky to reproduce. This adds extent map selftests in order to simulate those racy

[PATCH 06/10] Btrfs: fix incorrect block_len in merge_extent_mapping

2017-12-21 Thread Liu Bo
%block_len could be checked on deciding if two em are mergeable. merge_extent_mapping() has only added the front pad if the front part of em gets truncated, but it's possible that the end part gets truncated. For both compressed extent and inline extent, em->block_len is not adjusted accordingly,

[PATCH 01/10] Btrfs: add helper for em merge logic

2017-12-21 Thread Liu Bo
This is preparatory work for the following extent map selftest, which runs tests against em merge logic. Signed-off-by: Liu Bo --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/inode.c | 101 ++- 2 files changed, 58 insertions(+), 45

[PATCH 00/10] bugfixes and regression tests of btrfs_get_extent

2017-12-21 Thread Liu Bo
Although commit e6c4efd87ab0 ("btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map") fixed up the negative em->len, it has introduced several regressions, several of which have been fixed by commit 32be3a1ac6d0 ("btrfs: Fix the wrong condition judgment about subset extent

[PATCH 05/10] Btrfs: extent map selftest: dio write vs dio read

2017-12-21 Thread Liu Bo
This test case simulates the racy situation of dio write vs dio read, and see if btrfs_get_extent() would return -EEXIST. Signed-off-by: Liu Bo --- fs/btrfs/tests/extent-map-tests.c | 88 +++ 1 file changed, 88 insertions(+) diff --git

[PATCH 07/10] Btrfs: fix unexpected EEXIST from btrfs_get_extent

2017-12-21 Thread Liu Bo
This fixes a corner case that is caused by a race of dio write vs dio read/write. dio write: [0, 32k) -> [0, 8k) + [8k, 32k) dio read/write: While get_extent() with [0, 4k), [0, 8k) is found as existing em, even though start == existing->start, em is [0, 32k), extent_map_end(em) >
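
The ranges named in this preview can be checked with a small half-open interval sketch. It does not reproduce the fix, whose description is cut off above. `struct em_range` and both helpers are hypothetical; they only show why a new mapping [0, 32k) collides with an existing [0, 8k) even though both start at the same offset.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical half-open byte range [start, start + len). */
struct em_range {
	uint64_t start;
	uint64_t len;
};

/* First byte offset past the end of the range. */
uint64_t em_range_end(const struct em_range *r)
{
	return r->start + r->len;
}

/* Two half-open ranges overlap when each starts before the other ends. */
bool em_ranges_overlap(const struct em_range *a, const struct em_range *b)
{
	return a->start < em_range_end(b) && b->start < em_range_end(a);
}
```

Here the new mapping extends past the end of the existing one, while the remaining piece [8k, 32k) on its own does not overlap [0, 8k).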

Btrfs blocked by too many delayed refs

2017-12-21 Thread Martin Raiber
Hi, I have the problem that too many delayed refs block a btrfs storage. I have one thread that does work: [] io_schedule+0x16/0x40 [] wait_on_page_bit+0x116/0x150 [] read_extent_buffer_pages+0x1c5/0x290 [] btree_read_extent_buffer_pages+0x9d/0x100 [] read_tree_block+0x32/0x50 []

Re: Btrfs allow compression on NoDataCow files? (AFAIK Not, but it does)

2017-12-21 Thread Chris Mason
On 12/20/2017 03:59 PM, Timofey Titovets wrote: How reproduce: touch test_file chattr +C test_file dd if=/dev/zero of=test_file bs=1M count=1 btrfs fi def -vrczlib test_file filefrag -v test_file test_file Filesystem type is: 9123683e File size of test_file is 1048576 (256 blocks of 4096 bytes)

Re: [PATCH 02/14] btrfs: qgroup: Introduce helpers to update and access new qgroup rsv

2017-12-21 Thread Nikolay Borisov
On 12.12.2017 09:34, Qu Wenruo wrote: > Introduce helpers to: > > 1) Get total reserved space >For limit calculation > 2) Add/release reserved space for given type >With underflow detection and warning > 3) Add/release reserved space according to child qgroup > > Signed-off-by: Qu

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2017-12-21 Thread Tejun Heo
Hello, On Thu, Dec 21, 2017 at 11:56:49AM +0800, jianchao.wang wrote: > It's worrying that even though the blk_mark_rq_complete() here is intended to > synchronize with > timeout path, but it indeed give the blk_mq_complete_request() the capability > to exclude with > itself. Maybe this

Re: Unexpected raid1 behaviour

2017-12-21 Thread Austin S. Hemmelgarn
On 2017-12-21 06:44, Andrei Borzenkov wrote: On Tue, Dec 19, 2017 at 11:47 PM, Austin S. Hemmelgarn wrote: On 2017-12-19 15:41, Tomasz Pala wrote: On Tue, Dec 19, 2017 at 12:35:20 -0700, Chris Murphy wrote: with a read only file system. Another reason is the kernel

Re: [PATCH v4 72/73] xfs: Convert mru cache to XArray

2017-12-21 Thread Knut Omang
Joe Perches writes: > On Tue, 2017-12-12 at 08:43 +1100, Dave Chinner wrote: >> On Sat, Dec 09, 2017 at 09:00:18AM -0800, Joe Perches wrote: >> > On Sat, 2017-12-09 at 09:36 +1100, Dave Chinner wrote: >> > > 1. Using lockdep_set_novalidate_class() for anything other >> > >

Re: Unexpected raid1 behaviour

2017-12-21 Thread Andrei Borzenkov
On Wed, Dec 20, 2017 at 11:07 PM, Chris Murphy wrote: > > YaST doesn't have Btrfs raid1 or raid10 options; and also won't do > encrypted root with Btrfs either because YaST enforces LVM to do LUKS > encryption for some weird reason; and it also enforces NOT putting >

Re: [PATCH v3 19/19] fs: handle inode->i_version more efficiently

2017-12-21 Thread Jan Kara
On Thu 21-12-17 06:25:55, Jeff Layton wrote: > Got it, I think. How about this (sorry for the unrelated deltas here): > > [PATCH] SQUASH: add memory barriers around i_version accesses Yep, this looks good to me. Honza > >

Re: [PATCH v3 19/19] fs: handle inode->i_version more efficiently

2017-12-21 Thread Jeff Layton
On Wed, 2017-12-20 at 17:41 +0100, Jan Kara wrote: > On Wed 20-12-17 09:03:06, Jeff Layton wrote: > > On Tue, 2017-12-19 at 09:07 +1100, Dave Chinner wrote: > > > On Mon, Dec 18, 2017 at 02:35:20PM -0500, Jeff Layton wrote: > > > > [PATCH] SQUASH: add memory barriers around i_version accesses > >

Re: Cannot balance, ENOSPC errors 4.14.2 vanilla kernel

2017-12-21 Thread Qu Wenruo
On 2017年12月21日 15:56, Adam Bahe wrote: > Alright, I have rebuilt kernel 4.14.8 and added the line of code you > gave me. The kernel is installed and I have a full balance running. > Right off the bat one thing I noticed is that the last time I ran a > full balance, balance status showed