Re: [PATCH] btrfs: Avoid getting stuck during cyclic writebacks

2019-10-08 Thread Tejun Heo
Hello, On Tue, Oct 08, 2019 at 04:23:22PM +0200, David Sterba wrote: > > 1. There is a single file which has accumulated enough dirty pages to > >trigger balance_dirty_pages() and the writer appending to the file > >with a series of short writes. > > > > 2. bdp kicks in, wakes up backgrou

[PATCH] btrfs: Avoid getting stuck during cyclic writebacks

2019-10-03 Thread Tejun Heo
ne_index past the current page being processed. Note that this problem exists in other writepages too. Signed-off-by: Tejun Heo Cc: sta...@vger.kernel.org --- fs/btrfs/extent_io.c | 12 +--- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/

Re: [PATCHSET v3 btrfs/for-next] btrfs: fix cgroup writeback support

2019-09-06 Thread Tejun Heo
Hello, David. On Thu, Sep 05, 2019 at 01:59:37PM +0200, David Sterba wrote: > On Fri, Jul 26, 2019 at 05:13:21PM +0200, David Sterba wrote: > > On Wed, Jul 10, 2019 at 12:28:13PM -0700, Tejun Heo wrote: > > > Hello, > > > > > > This patchset contains only the

[PATCH 3/5] Btrfs: only associate the locked page with one async_cow struct

2019-07-10 Thread Tejun Heo
From: Chris Mason The btrfs writepages function collects a large range of pages flagged for delayed allocation, and then sends them down through the COW code for processing. When compression is on, we allocate one async_cow structure for every 512K, and then run those pages through the compressi

[PATCHSET v3 btrfs/for-next] btrfs: fix cgroup writeback support

2019-07-10 Thread Tejun Heo
Hello, This patchset contains only the btrfs part of the following patchset. [1] [PATCHSET v2 btrfs/for-next] blkcg, btrfs: fix cgroup writeback support The block part has already been applied to https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/ for-linus with some na

[PATCH 1/5] Btrfs: stop using btrfs_schedule_bio()

2019-07-10 Thread Tejun Heo
From: Chris Mason btrfs_schedule_bio() hands IO off to a helper thread to do the actual submit_bio() call. This has been used to make sure async crc and compression helpers don't get stuck on IO submission. To maintain good performance, over time the IO submission threads duplicated some IO sch

[PATCH 4/5] Btrfs: use REQ_CGROUP_PUNT for worker thread submitted bios

2019-07-10 Thread Tejun Heo
Chris Mason Modified-and-reviewed-by: Tejun Heo Reviewed-by: Josef Bacik --- fs/btrfs/compression.c | 8 +++- fs/btrfs/compression.h | 3 ++- fs/btrfs/disk-io.c | 6 ++ fs/btrfs/extent_io.c | 3 +++ fs/btrfs/inode.c | 31 --- 5 files change

[PATCH 5/5] Btrfs: extent_write_locked_range() should attach inode->i_wb

2019-07-10 Thread Tejun Heo
From: Chris Mason extent_write_locked_range() is used when we're falling back to buffered IO from inside of compression. It allocates its own wbc and should associate it with the inode's i_wb to make sure the IO goes down from the correct cgroup. Signed-off-by: Chris Mason Reviewed-by: Josef B

[PATCH 2/5] Btrfs: delete the entire async bio submission framework

2019-07-10 Thread Tejun Heo
From: Chris Mason Now that we're not using btrfs_schedule_bio() anymore, delete all the code that supported it. Signed-off-by: Chris Mason Reviewed-by: Josef Bacik --- fs/btrfs/ctree.h | 1 - fs/btrfs/disk-io.c | 13 +-- fs/btrfs/super.c | 1 - fs/btrfs/volumes.c | 209 --

[PATCHSET v5] blk-mq: reimplement timeout handling

2018-01-09 Thread Tejun Heo
Hello, Changes from [v4] - Comments added. Patch description updated. Changes from [v3] - Rebased on top of for-4.16/block. - Integrated Jens's hctx_[un]lock() factoring patch and refreshed the patches accordingly. - Added comment explaining the use of hctx_lock() instead of rcu_read_loc

[PATCH 1/8] blk-mq: move hctx lock/unlock into a helper

2018-01-09 Thread Tejun Heo
Signed-off-by: Tejun Heo --- block/blk-mq.c | 66 -- 1 file changed, 32 insertions(+), 34 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 111e1aa..ddc9261 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -557,6 +557,22

[PATCH 2/8] blk-mq: protect completion path with RCU

2018-01-09 Thread Tejun Heo
Currently, blk-mq protects only the issue path with RCU. This patch puts the completion path under the same RCU protection. This will be used to synchronize issue/completion against timeout by later patches, which will also add the comments. Signed-off-by: Tejun Heo --- block/blk-mq.c | 5

[PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme

2018-01-09 Thread Tejun Heo
n top of hctx_lock() refactoring patch. - Added comment explaining the use of hctx_lock() in completion path. v5: - Added comments requested by Bart. - Note the addition of BLK_EH_RESET_TIMER race condition in the commit message. Signed-off-by: Tejun Heo Cc: "jianchao.wang"

[PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path

2018-01-09 Thread Tejun Heo
ion around ->deadline update as requested by Bart. Signed-off-by: Tejun Heo Cc: Asai Thambi SP Cc: Stefan Haberland Cc: Jan Hoeppner Cc: Bart Van Assche --- block/blk-mq.c | 2 +- block/blk-mq.h | 2 -- block/blk-timeout.c | 13 + 3 files changed, 10 insertions(+), 7 delet

[PATCH 4/8] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE

2018-01-09 Thread Tejun Heo
. Signed-off-by: Tejun Heo --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 052fee5..51e9704 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -95,8 +95,7 @@ static void blk_mq_check_inflight(struct blk_mq_hw_ctx

[PATCH 6/8] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-09 Thread Tejun Heo
timeout multiple times. This removes atomic bitops from hot paths too. v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out(). v3: Added RQF_MQ_TIMEOUT_EXPIRED flag. Signed-off-by: Tejun Heo Cc: "jianchao.wang" --- block/blk-mq.c | 15 +++ block/blk

[PATCH 7/8] blk-mq: remove REQ_ATOM_STARTED

2018-01-09 Thread Tejun Heo
. REQ_ATOM_STARTED no longer has any users left and is removed. Signed-off-by: Tejun Heo --- block/blk-mq-debugfs.c | 4 +--- block/blk-mq.c | 37 - block/blk-mq.h | 1 + block/blk.h| 1 - 4 files changed, 10 insertions(+), 33 deletions

[PATCH 8/8] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu

2018-01-09 Thread Tejun Heo
The RCU protection has been expanded to cover both queueing and completion paths making ->queue_rq_srcu a misnomer. Rename it to ->srcu as suggested by Bart. Signed-off-by: Tejun Heo Cc: Bart Van Assche --- block/blk-mq.c | 14 +++--- include/linux/blk-mq.h | 2 +- 2

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread Tejun Heo
Hello, On Tue, Jan 09, 2018 at 11:08:04AM +0800, jianchao.wang wrote: > > But what'd prevent the completion reinitializing the request and then > > the actual completion path coming in and completing the request again? > > blk_mark_rq_complete() will gate and ensure there will be only one > __blk

[PATCH 1/8] blk-mq: move hctx lock/unlock into a helper

2018-01-08 Thread Tejun Heo
Signed-off-by: Tejun Heo --- block/blk-mq.c | 66 -- 1 file changed, 32 insertions(+), 34 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 111e1aa..ddc9261 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -557,6 +557,22

[PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme

2018-01-08 Thread Tejun Heo
tly being set in complete_request instead of free_request. Fixed. v4: - Rebased on top of hctx_lock() refactoring patch. - Added comment explaining the use of hctx_lock() in completion path. Signed-off-by: Tejun Heo Cc: "jianchao.wang" Cc: Peter Zijlstra Cc: Chr

[PATCH 2/8] blk-mq: protect completion path with RCU

2018-01-08 Thread Tejun Heo
Currently, blk-mq protects only the issue path with RCU. This patch puts the completion path under the same RCU protection. This will be used to synchronize issue/completion against timeout by later patches, which will also add the comments. Signed-off-by: Tejun Heo --- block/blk-mq.c | 5

[PATCHSET v4] blk-mq: reimplement timeout handling

2018-01-08 Thread Tejun Heo
Hello, Changes from [v3] - Rebased on top of for-4.16/block. - Integrated Jens's hctx_[un]lock() factoring patch and refreshed the patches accordingly. - Added comment explaining the use of hctx_lock() instead of rcu_read_lock() in completion path. Changes from [v2] - Possible extended lo

[PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path

2018-01-08 Thread Tejun Heo
short while, even when the caller owns the request. AFAICS, SCSI and ATA should be fine with that and I think mtip32xx and dasd should be safe but not completely sure. It'd be great if people who know the drivers take a look. Signed-off-by: Tejun Heo Cc: Asai Thambi SP Cc: Stefan Haberlan

[PATCH 4/8] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE

2018-01-08 Thread Tejun Heo
. Signed-off-by: Tejun Heo --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 6587f0c..41bfd27 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -95,8 +95,7 @@ static void blk_mq_check_inflight(struct blk_mq_hw_ctx

[PATCH 7/8] blk-mq: remove REQ_ATOM_STARTED

2018-01-08 Thread Tejun Heo
. REQ_ATOM_STARTED no longer has any users left and is removed. Signed-off-by: Tejun Heo --- block/blk-mq-debugfs.c | 4 +--- block/blk-mq.c | 37 - block/blk-mq.h | 1 + block/blk.h| 1 - 4 files changed, 10 insertions(+), 33 deletions

[PATCH 6/8] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread Tejun Heo
timeout multiple times. This removes atomic bitops from hot paths too. v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out(). v3: Added RQF_MQ_TIMEOUT_EXPIRED flag. Signed-off-by: Tejun Heo Cc: "jianchao.wang" --- block/blk-mq.c | 15 +++ block/blk

[PATCH 8/8] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu

2018-01-08 Thread Tejun Heo
The RCU protection has been expanded to cover both queueing and completion paths making ->queue_rq_srcu a misnomer. Rename it to ->srcu as suggested by Bart. Signed-off-by: Tejun Heo Cc: Bart Van Assche --- block/blk-mq.c | 14 +++--- include/linux/blk-mq.h | 2 +- 2

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread Tejun Heo
Hello, Jianchao. On Fri, Dec 22, 2017 at 12:02:20PM +0800, jianchao.wang wrote: > > On Thu, Dec 21, 2017 at 11:56:49AM +0800, jianchao.wang wrote: > >> It's worrying that even though the blk_mark_rq_complete() here is > >> intended to synchronize with timeout path, but it indeed give the > >> blk_

Re: [PATCH 1/7] blk-mq: protect completion path with RCU

2018-01-08 Thread Tejun Heo
Hello, Christoph. On Fri, Dec 29, 2017 at 02:04:18AM -0800, Christoph Hellwig wrote: > Why do you need the srcu protection? The completion path can never > sleep. > > If there is a good reason to keep it please add a comment, and > make the srcu variant a separate function only used by drivers th

Re: [PATCHSET v3] blk-mq: reimplement timeout handling

2018-01-08 Thread Tejun Heo
On Fri, Dec 29, 2017 at 02:02:39AM -0800, Christoph Hellwig wrote: > This seems to miss the linux-block list once again. Please include > it in the next resend. Sorry about that. Copy/pasted from the older thread without thinking. Thanks. -- tejun -- To unsubscribe from this list: send the li

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2017-12-21 Thread Tejun Heo
Hello, On Thu, Dec 21, 2017 at 11:56:49AM +0800, jianchao.wang wrote: > It's worrying that even though the blk_mark_rq_complete() here is intended to > synchronize with > timeout path, but it indeed give the blk_mq_complete_request() the capability > to exclude with > itself. Maybe this capabil

[PATCH 2/7] blk-mq: replace timeout synchronization with a RCU and generation based scheme

2017-12-16 Thread Tejun Heo
tly being set in complete_request instead of free_request. Fixed. Signed-off-by: Tejun Heo Cc: "jianchao.wang" Cc: Peter Zijlstra --- block/blk-core.c | 2 + block/blk-mq.c | 220 + block/blk-mq.h | 46 +

[PATCHSET v3] blk-mq: reimplement timeout handling

2017-12-16 Thread Tejun Heo
Hello, Changes from [v2] - Possible extended looping around seqcount and u64_stat_sync fixed. - Misplaced MQ_RQ_IDLE state setting fixed. - RQF_MQ_TIMEOUT_EXPIRED added to prevent firing the same timeout multiple times. - s/queue_rq_src/srcu/ patch added. - Other misc changes. Changes from

[PATCH 1/7] blk-mq: protect completion path with RCU

2017-12-16 Thread Tejun Heo
Currently, blk-mq protects only the issue path with RCU. This patch puts the completion path under the same RCU protection. This will be used to synchronize issue/completion against timeout by later patches, which will also add the comments. Signed-off-by: Tejun Heo --- block/blk-mq.c | 16

[PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2017-12-16 Thread Tejun Heo
timeout multiple times. This removes atomic bitops from hot paths too. v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out(). v3: Added RQF_MQ_TIMEOUT_EXPIRED flag. Signed-off-by: Tejun Heo Cc: "jianchao.wang" --- block/blk-mq.c | 18 -- block/blk

[PATCH 7/7] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu

2017-12-16 Thread Tejun Heo
The RCU protection has been expanded to cover both queueing and completion paths making ->queue_rq_srcu a misnomer. Rename it to ->srcu as suggested by Bart. Signed-off-by: Tejun Heo Cc: Bart Van Assche --- block/blk-mq.c | 22 +++--- include/linux/blk-mq.h | 2

[PATCH 3/7] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE

2017-12-16 Thread Tejun Heo
. Signed-off-by: Tejun Heo --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index abd5d01..643a38d 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -95,8 +95,7 @@ static void blk_mq_check_inflight(struct blk_mq_hw_ctx

[PATCH 4/7] blk-mq: make blk_abort_request() trigger timeout path

2017-12-16 Thread Tejun Heo
short while, even when the caller owns the request. AFAICS, SCSI and ATA should be fine with that and I think mtip32xx and dasd should be safe but not completely sure. It'd be great if people who know the drivers take a look. Signed-off-by: Tejun Heo Cc: Asai Thambi SP Cc: Stefan Haberlan

[PATCH 6/7] blk-mq: remove REQ_ATOM_STARTED

2017-12-16 Thread Tejun Heo
. REQ_ATOM_STARTED no longer has any users left and is removed. Signed-off-by: Tejun Heo --- block/blk-mq-debugfs.c | 4 +--- block/blk-mq.c | 37 - block/blk-mq.h | 1 + block/blk.h| 1 - 4 files changed, 10 insertions(+), 33 deletions

Re: [PATCHSET v2] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-11-29 Thread Tejun Heo
On Wed, Nov 29, 2017 at 09:03:30AM -0800, Tejun Heo wrote: > Hello, > > On Wed, Nov 29, 2017 at 05:56:08PM +0100, Jan Kara wrote: > > What has happened with this patch set? > > No idea. cc'ing Chris directly. Chris, if the patchset looks good, > can you please rout

Re: [PATCHSET v2] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-11-29 Thread Tejun Heo
Hello, On Wed, Nov 29, 2017 at 05:56:08PM +0100, Jan Kara wrote: > What has happened with this patch set? No idea. cc'ing Chris directly. Chris, if the patchset looks good, can you please route them through the btrfs tree? Thanks. -- tejun

[PATCH v3 5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-12 Thread Tejun Heo
. Signed-off-by: Tejun Heo Reviewed-by: Liu Bo Cc: David Sterba Cc: Chris Mason Cc: Josef Bacik --- fs/btrfs/check-integrity.c |2 +- fs/btrfs/disk-io.c |4 fs/btrfs/ioctl.c |4 3 files changed, 9 insertions(+), 1 deletion(-) --- a/fs/btrfs/check

Re: [PATCH v2 5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-12 Thread Tejun Heo
On Wed, Oct 11, 2017 at 07:07:23PM +0200, David Sterba wrote: > The comment is useful, but the condition will be always true, so I don't > see the point. > > /* >* The btree_inode will be always in the root cgroup. The cgroup >* writeback can be enabled on regular inodes sele

[PATCH v2 5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-10 Thread Tejun Heo
h_blkcg_css() call. Signed-off-by: Tejun Heo Cc: Chris Mason Cc: Josef Bacik --- fs/btrfs/check-integrity.c | 2 +- fs/btrfs/disk-io.c | 4 fs/btrfs/ioctl.c | 4 +++- 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check

[PATCHSET v2] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-10-10 Thread Tejun Heo
Hello, Changes from the last version are * blkcg_root_css exported to fix build breakage on modular btrfs. * Use ext4_should_journal_data() test instead of EXT4_MOUNT_JOURNAL_DATA. * Separated out create_bh_bio() and used it to implement submit_bh_blkcg_css() as suggested by Jan. btrfs has

[PATCH 3/5] buffer_head: separate out create_bh_bio() from submit_bh_wbc()

2017-10-10 Thread Tejun Heo
handling into submit_bh_wbc() and similarly this will make adding more submit_bh variants straight-forward. This patch is pure refactoring and doesn't cause any functional changes. Signed-off-by: Tejun Heo Suggested-by: Jan Kara --- fs/buffer.c | 30 ++ 1 file change

[PATCH 2/5] cgroup, writeback: replace SB_I_CGROUPWB with per-inode S_CGROUPWB

2017-10-10 Thread Tejun Heo
avior change is intended. v2: Use ext4_should_journal_data() as suggested by Jan. Signed-off-by: Tejun Heo Reviewed-by: Jan Kara Cc: Jens Axboe Cc: Chris Mason Cc: Josef Bacik Cc: linux-btrfs@vger.kernel.org Cc: "Theodore Ts'o" Cc: Andreas Dilger Cc: linux-e...@vger.kernel.org

[PATCH 4/5] cgroup, buffer_head: implement submit_bh_blkcg_css()

2017-10-10 Thread Tejun Heo
Implement submit_bh_blkcg_css() which will be used to override cgroup membership on specific buffer_heads. v2: Reimplemented using create_bh_bio() as suggested by Jan. Signed-off-by: Tejun Heo Cc: Jan Kara Cc: Jens Axboe --- fs/buffer.c | 12 include/linux

[PATCH 5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-10 Thread Tejun Heo
don't call the function during init; however, this serves as documentation and prevents possible future mistakes. If this isn't desirable, please feel free to drop the section. Signed-off-by: Tejun Heo Cc: Chris Mason Cc: Josef Bacik --- fs/btrfs/check-integrity.c | 2 +- fs/

[PATCH 1/5] blkcg: export blkcg_root_css

2017-10-10 Thread Tejun Heo
Export blkcg_root_css so that filesystem modules can use it. Signed-off-by: Tejun Heo --- block/blk-cgroup.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index d3f56ba..597a457 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -45,6 +45,7

[PATCHSET] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-10-09 Thread Tejun Heo
Hello, btrfs has different ways to issue metadata IOs and may end up issuing metadata or otherwise shared IOs from a non-root cgroup, which can lead to priority inversion and ineffective IO isolation. This patchset makes sure that btrfs issues all metadata and shared IOs from the root cgroup by e

[PATCH 2/3] cgroup, writeback: implement submit_bh_blkcg_css()

2017-10-09 Thread Tejun Heo
Add wbc->blkcg_css so that the blkcg_css association can be specified independently and implement submit_bh_blkcg_css() using it. This will be used to override cgroup membership on specific buffer_heads. Signed-off-by: Tejun Heo Cc: Jan Kara Cc: Jens Axboe --- fs/buffe
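
The override described in this snippet can be pictured with a small userspace model. This is only a sketch under stated assumptions: `sketch_css`, `sketch_wbc`, `inode_css` and `sketch_bio_css()` are hypothetical stand-ins rather than kernel structures or functions, and the fallback to the inode's association when no override is set is an assumption about how such a field would be consumed, not something the snippet confirms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup subsystem state; not the kernel type. */
struct sketch_css {
	const char *name;
};

/* Hypothetical stand-in for a writeback control structure. */
struct sketch_wbc {
	const struct sketch_css *inode_css; /* association inherited from the inode */
	const struct sketch_css *blkcg_css; /* explicit override, NULL when unset */
};

/*
 * Pick the cgroup that submitted IO should be charged to: the explicit
 * override when one was specified, otherwise the inode's association.
 */
static const struct sketch_css *sketch_bio_css(const struct sketch_wbc *wbc)
{
	assert(wbc != NULL);
	return wbc->blkcg_css ? wbc->blkcg_css : wbc->inode_css;
}
```

In this model a caller that wants a buffer charged to the root cgroup would set `blkcg_css` to the root entry before submission and leave it NULL everywhere else.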

[PATCH 1/3] cgroup, writeback: replace SB_I_CGROUPWB with per-inode S_CGROUPWB

2017-10-09 Thread Tejun Heo
btree_inode which doesn't use btrfs_update_iflags() during initialization. This is an intended behavior change. Signed-off-by: Tejun Heo Cc: Jan Kara Cc: Jens Axboe Cc: Chris Mason Cc: Josef Bacik Cc: linux-btrfs@vger.kernel.org Cc: "Theodore Ts'o" Cc: Andreas

[PATCH 3/3] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-09 Thread Tejun Heo
don't call the function during init; however, this serves as documentation and prevents possible future mistakes. If this isn't desirable, please feel free to drop the section. Signed-off-by: Tejun Heo Cc: Chris Mason Cc: Josef Bacik --- fs/btrfs/check-integrity.c | 2 +- fs/

Re: [PATCH 3/5] writeback: add counters for metadata usage

2016-10-26 Thread Tejun Heo
Hello, Josef. On Wed, Oct 26, 2016 at 11:20:16AM -0400, Josef Bacik wrote: > > > @@ -3701,7 +3703,20 @@ static unsigned long > > > node_pagecache_reclaimable(struct pglist_data *pgdat) > > > if (unlikely(delta > nr_pagecache_reclaimable)) > > > delta = nr_pagecache_reclaimable; > > >

Re: [PATCH 4/5] writeback: introduce super_operations->write_metadata

2016-10-25 Thread Tejun Heo
> into > their ->write_metadata callback. > > Signed-off-by: Josef Bacik > Reviewed-by: Jan Kara Reviewed-by: Tejun Heo > @@ -1491,6 +1516,7 @@ static long writeback_sb_inodes(struct super_block *sb, > unsigned long start_time = jiffies; > long write_chu

Re: [PATCH 3/5] writeback: add counters for metadata usage

2016-10-25 Thread Tejun Heo
Hello, On Tue, Oct 25, 2016 at 02:41:42PM -0400, Josef Bacik wrote: > Btrfs has no bounds except memory on the amount of dirty memory that we have > in > use for metadata. Historically we have used a special inode so we could take > advantage of the balance_dirty_pages throttling that comes with

Re: [PATCH 2/5] writeback: convert WB_WRITTEN/WB_DIRITED counters to bytes

2016-10-25 Thread Tejun Heo
them to count bytes written/dirtied, and allow the > metadata accounting stuff to change the counters as well. > > Signed-off-by: Josef Bacik Acked-by: Tejun Heo A small nit below. > @@ -2547,12 +2547,16 @@ void account_page_redirty(struct page *page) > if (mapping &

Re: [PATCH 1/5] remove mapping from balance_dirty_pages*()

2016-10-25 Thread Tejun Heo
Josef Bacik > Reviewed-by: Jan Kara Acked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH 2/2] writeback: allow for dirty metadata accounting

2016-08-10 Thread Tejun Heo
Hello, Josef. On Wed, Aug 10, 2016 at 05:16:03PM -0400, Josef Bacik wrote: > > It bothers me a bit that sb's can actually be off bdi->sb_list while > > sb_list_lock is released. Can we make this explicit? e.g. keep > > separate bdi sb list for sb's pending metadata writeout (like b_dirty) > > or

Re: [PATCH 2/2] writeback: allow for dirty metadata accounting

2016-08-10 Thread Tejun Heo
Hello, Josef. On Tue, Aug 09, 2016 at 03:08:27PM -0400, Josef Bacik wrote: > Provide a mechanism for file systems to indicate how much dirty metadata they > are holding. This introduces a few things > > 1) Zone stats for dirty metadata, which is the same as the NR_FILE_DIRTY. > 2) WB stat for di

Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()

2016-08-10 Thread Tejun Heo
apping. Since > balance_dirty_pages*() works on a bdi level, just pass in the bdi and super > block directly so we can avoid using mapping. This will allow us to still use > balance_dirty_pages for dirty metadata pages that are not backed by an > address_mapping. > > Signed-off-b

Re: GPF in __mark_inode_dirty due to locked_inode_to_wb_and_lock_list returning NULL

2016-07-13 Thread Tejun Heo
Hello, On Mon, Jul 04, 2016 at 04:15:35PM +0300, Nikolay Borisov wrote: > So the btrfs fs was created inside a loop device and mounted with -o loop. > Evidently from the oops it seems that this is the normal umount path, meaning > that no device hot plugging was in action. Unfortunately I don't

Re: GPF in __mark_inode_dirty due to locked_inode_to_wb_and_lock_list returning NULL

2016-07-01 Thread Tejun Heo
On Fri, Jul 01, 2016 at 12:00:50PM +0200, Jan Kara wrote: > Hello, > > On Thu 30-06-16 14:18:14, Nikolay Borisov wrote: > > In light of the discussion in https://patchwork.kernel.org/patch/9187411/ > > and > > the discussion at > > https://groups.google.com/forum/#!topic/syzkaller/XvxH3cBQ134 >

Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038

2014-02-08 Thread Tejun Heo
Hello, David, Fengguang, Chris. On Fri, Feb 07, 2014 at 01:13:06PM -0800, David Rientjes wrote: > On Fri, 7 Feb 2014, Fengguang Wu wrote: > > > On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote: > > > On Fri, 7 Feb 2014, Fengguang Wu wrote: > > > > > > > [1.625020] BTRFS: selfte

Re: [BUG REPORT] Kernel panic on 3.9.0-rc7-4-gbb33db7

2013-04-18 Thread Tejun Heo
On Thu, Apr 18, 2013 at 10:57:54PM -0700, Tejun Heo wrote: > No wonder this thing crashes. Chris, can't the original bio carry > bbio in bi_private and let end_bio_extent_readpage() free the bbio > instead of abusing bi_bdev like this? BTW, I think it's a bit too late to fix

Re: [BUG REPORT] Kernel panic on 3.9.0-rc7-4-gbb33db7

2013-04-18 Thread Tejun Heo
(cc'ing btrfs people) On Fri, Apr 19, 2013 at 11:33:20AM +0800, Wanlong Gao wrote: > RIP: 0010:[] [] > ftrace_raw_event_block_bio_complete+0x73/0xf0 ... > [] bio_endio+0x80/0x90 > [] btrfs_end_bio+0xf6/0x190 [btrfs] > [] bio_endio+0x3d/0x90 > [] req_bio_endio+0xa3/0xe0 Ugh In fs/btrfs/

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-30 Thread Tejun Heo
Hey, Peter. On Tue, Mar 29, 2011 at 07:37:33PM +0200, Peter Zijlstra wrote: > On Tue, 2011-03-29 at 19:09 +0200, Tejun Heo wrote: > > Here's the combined patch I was planning on testing but didn't get to > > (yet). It implements two things - hard limit on spin duration a

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-29 Thread Tejun Heo
Here's the combined patch I was planning on testing but didn't get to (yet). It implements two things - hard limit on spin duration and early break if the owner also is spinning on a mutex. Thanks. Index: work1/include/linux/sched.h ===
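
The two ideas named in this snippet, a hard limit on spin duration and an early break when the lock owner is itself spinning, can be sketched in userspace C11. This is a minimal model and not the patch itself: every `sketch_*` name is hypothetical, the spin limit is counted in loop iterations rather than time, and the extra bail-out when the owner is not running is an assumption borrowed from how adaptive spinning is usually described.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these names are not from the kernel patch. */
struct sketch_task {
	atomic_bool on_cpu;   /* true while the task is running on a CPU */
	atomic_bool spinning; /* true while the task spins on some mutex */
};

struct sketch_mutex {
	_Atomic(struct sketch_task *) owner; /* NULL when unlocked */
};

enum sketch_result {
	SKETCH_ACQUIRED,          /* lock taken */
	SKETCH_OWNER_NOT_RUNNING, /* owner is off CPU, spinning is pointless */
	SKETCH_OWNER_SPINNING,    /* owner spins on another lock, break early */
	SKETCH_SPIN_LIMIT         /* hard limit on spin attempts reached */
};

static void sketch_task_init(struct sketch_task *t, bool on_cpu)
{
	assert(t != NULL);
	atomic_init(&t->on_cpu, on_cpu);
	atomic_init(&t->spinning, false);
}

static void sketch_mutex_init(struct sketch_mutex *m)
{
	assert(m != NULL);
	atomic_init(&m->owner, (struct sketch_task *)NULL);
}

static void sketch_unlock(struct sketch_mutex *m)
{
	atomic_store(&m->owner, (struct sketch_task *)NULL);
}

/* Try to take the lock, spinning at most max_spins extra times. */
static enum sketch_result sketch_trylock_spin(struct sketch_mutex *m,
					      struct sketch_task *self,
					      unsigned int max_spins)
{
	enum sketch_result res = SKETCH_SPIN_LIMIT;

	atomic_store(&self->spinning, true);
	for (unsigned int i = 0; i <= max_spins; i++) {
		struct sketch_task *expected = NULL;

		if (atomic_compare_exchange_strong(&m->owner, &expected, self)) {
			res = SKETCH_ACQUIRED;
			break;
		}
		/* On failure, expected holds the owner seen by the CAS. */
		if (!atomic_load(&expected->on_cpu)) {
			res = SKETCH_OWNER_NOT_RUNNING;
			break;
		}
		if (atomic_load(&expected->spinning)) {
			res = SKETCH_OWNER_SPINNING;
			break;
		}
	}
	atomic_store(&self->spinning, false);
	return res;
}
```

The early break on a spinning owner is what keeps two tasks from spinning on each other indefinitely in this model; the iteration cap bounds the cost when neither exit condition triggers.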

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-29 Thread Tejun Heo
Hello, guys. I've been running dbench 50 for a few days now and the result is, well, I don't know how to call it. The problem was that the original patch didn't do anything because x86 fastpath code didn't call into the generic slowpath at all. static inline int __mutex_fastpath_trylock(atomic

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-25 Thread Tejun Heo
Hello, On Thu, Mar 24, 2011 at 10:41:51AM +0100, Tejun Heo wrote: > USER SYSTEM SIRQ CXTSW THROUGHPUT > SIMPLE 61107 354977 217 8099529 845.100 MB/sec > SPIN 63140 364888 214 6840527 879.077 MB/sec > > On various runs, the adaptive spinning trylo

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-24 Thread Tejun Heo
Hello, Steven, Linus. On Thu, Mar 24, 2011 at 09:38:58PM -0700, Linus Torvalds wrote: > On Thu, Mar 24, 2011 at 8:39 PM, Steven Rostedt wrote: > > > > But now, mutex_trylock(B) becomes a spinner too, and since the B's owner > > is running (spinning on A) it will spin as well waiting for A's owner

[PATCH 3/3] btrfs: Simplify extent_buffer locking

2011-03-24 Thread Tejun Heo
e visibility into btrfs locking. Signed-off-by: Tejun Heo Cc: Peter Zijlstra Cc: Ingo Molnar --- fs/btrfs/Makefile|2 +- fs/btrfs/ctree.c | 16 ++-- fs/btrfs/extent_io.c |3 +- fs/btrfs/extent_io.h | 12 +-- fs/btrfs/locking.c | 233 -

[PATCH 1/3] btrfs: Cleanup extent_buffer lockdep code

2011-03-24 Thread Tejun Heo
btrfs_set_buffer_lockdep_class() should be dependent upon CONFIG_LOCKDEP instead of CONFIG_DEBUG_LOCK_ALLOC. Collect the related code into one place, use CONFIG_LOCKDEP instead and make some cosmetic changes. Signed-off-by: Tejun Heo --- fs/btrfs/disk-io.c | 22 ++ fs

[PATCH 2/3] btrfs: Use separate lockdep class keys for different roots

2011-03-24 Thread Tejun Heo
different sets of keys according to the type of @root. Signed-off-by: Tejun Heo --- fs/btrfs/disk-io.c | 91 +-- fs/btrfs/disk-io.h | 10 -- fs/btrfs/extent-tree.c |2 +- fs/btrfs/volumes.c |2 +- 4 files changed, 73 inserti

[RFC PATCHSET] btrfs: Simplify extent_buffer locking

2011-03-24 Thread Tejun Heo
Hello, This is split patchset of the RFC patches[1] to simplify btrfs locking and contains the following three patches. 0001-btrfs-Cleanup-extent_buffer-lockdep-code.patch 0002-btrfs-Use-separate-lockdep-class-keys-for-different-.patch 0003-btrfs-Simplify-extent_buffer-locking.patch For more

Re: [PATCH 1/2] Subject: mutex: Separate out mutex_spin()

2011-03-24 Thread Tejun Heo
Ugh... Please drop the extra "Subject: " from subject before applying. Thanks. -- tejun

[PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-24 Thread Tejun Heo
erence varies but it outperforms consistently. In general, using adaptive spinning on trylock makes sense as trylock failure usually leads to costly unlock-relock sequence. [1] http://article.gmane.org/gmane.comp.file-systems.btrfs/9658 Signed-off-by: Tejun Heo LKML-Reference: <20110323153727.gb1

[PATCH 1/2] Subject: mutex: Separate out mutex_spin()

2011-03-24 Thread Tejun Heo
epare for using adaptive spinning in mutex_trylock() and doesn't cause any behavior change. Signed-off-by: Tejun Heo LKML-Reference: <20110323153727.gb12...@htj.dyndns.org> Cc: Peter Zijlstra Cc: Ingo Molnar --- Here are split patches with SOB. Ingo, it's probably best to route

Re: [RFC PATCH] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-23 Thread Tejun Heo
On Wed, Mar 23, 2011 at 08:48:01AM -0700, Linus Torvalds wrote: > On Wed, Mar 23, 2011 at 8:37 AM, Tejun Heo wrote: > > > > Currently, mutex_trylock() doesn't use adaptive spinning.  It tries > > just once.  I got curious whether using adaptive spinning on > > mut

[RFC PATCH] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-23 Thread Tejun Heo
hether there are pending waiters or not. Is this intended or the test got lost somehow? Thanks. NOT-Signed-off-by: Tejun Heo --- kernel/mutex.c | 98 +++-- 1 file changed, 61 insertions(+), 37 delet

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-23 Thread Tejun Heo
Hello, Chris. On Tue, Mar 22, 2011 at 07:13:09PM -0400, Chris Mason wrote: > Ok, the impact of this is really interesting. If we have very short > waits where there is no IO at all, this patch tends to lose. I ran with > dbench 10 and got about 20% slower tput. > > But, if we do any IO at all

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-21 Thread Tejun Heo
Hello, On Mon, Mar 21, 2011 at 01:24:37PM -0400, Chris Mason wrote: > Very interesting. Ok, I'll definitely rerun my benchmarks as well. I > used dbench extensively during the initial tuning, but you're forcing > the memory low in order to force IO. > > This case doesn't really hammer on the lo

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-21 Thread Tejun Heo
On Mon, Mar 21, 2011 at 05:59:55PM +0100, Tejun Heo wrote: > I'm running DFL again just in case but SIMPLE or SPIN seems to be a > much better choice. Got 644.176 MB/sec, so yeah the custom locking is definitely worse than just using mutex. Thanks. -- tejun

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-21 Thread Tejun Heo
52063 470453 1446 3092091 701.826 I'm running DFL again just in case but SIMPLE or SPIN seems to be a much better choice. Thanks. NOT-Signed-off-by: Tejun Heo --- fs/btrfs/locking.h |2 ++ 1 file changed, 2 insertions(+) Index

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-21 Thread Tejun Heo
Hello, Chris. On Sun, Mar 20, 2011 at 08:10:51PM -0400, Chris Mason wrote: > I went through a number of benchmarks with the explicit > blocking/spinning code and back then it was still significantly faster > than the adaptive spin. But, it is definitely worth doing these again, > how many dbench

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-20 Thread Tejun Heo
On Sun, Mar 20, 2011 at 08:56:52PM +0100, Tejun Heo wrote: > So, here's the patch to implement and use mutex_try_spin(), which > applies the same owner spin logic to try locking. The result looks > pretty good. > > I re-ran all three. DFL is the current custom locking. S

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-20 Thread Tejun Heo
ume a bit more cpu than SIMPLE but shows discernably better throughput. I'm running SPIN again just in case but the result seems pretty consistent. Thanks. NOT-Signed-off-by: Tejun Heo --- fs/btrfs/locking.h|2 - include/linux/mutex.h |1 kernel/mutex.c

[PATCH RFC] btrfs: Simplify locking

2011-03-20 Thread Tejun Heo
ifference. Thanks. NOT-Signed-off-by: Tejun Heo --- fs/btrfs/Makefile|2 fs/btrfs/ctree.c | 16 +-- fs/btrfs/extent_io.c |3 fs/btrfs/extent_io.h | 12 -- fs/btrfs/locking.c | 233 --- fs/btrfs/locking.h | 43 ++

Re: Some very basic questions

2008-10-22 Thread Tejun Heo
Ric Wheeler wrote: >> FS waiting for completion of all the dependent writes isn't too good >> latency and throughput-wise tho. It would be best if FS can indicate >> dependencies between write commands and barrier so that barrier >> doesn't have to empty the whole queue. Hmm... Can someone tell m

Re: Some very basic questions

2008-10-22 Thread Tejun Heo
Ric Wheeler wrote: > Waiting for the target to ack an IO is not sufficient, since the target > ack does not (with write cache enabled) mean that it is on persistent > storage. FS waiting for completion of all the dependent writes isn't too good latency and throughput-wise tho. It would be best if

Re: Some very basic questions

2008-10-22 Thread Tejun Heo
Ric Wheeler wrote: > I think that we do handle a failure in the case that you outline above > since the FS will be able to notice the error before it sends a commit > down (and that commit is wrapped in the barrier flush calls). This is > the easy case since we still have the context for the IO. I

Re: Some very basic questions

2008-10-22 Thread Tejun Heo
Ric Wheeler wrote: > The cache flush command for ATA devices will block and wait until all of > the device's write cache has been written back. > > What I assume Tejun was referring to here is that some IO might have > been written out to the device and an error happened when the device > tried to