Re: [dm-devel] [PATCH V15 00/18] block: support multi-page bvec
On 2/19/19 5:17 PM, Ming Lei wrote: On Tue, Feb 19, 2019 at 08:28:19AM -0800, Bart Van Assche wrote: With this patch applied test nvmeof-mp/002 fails as follows: [ 694.700400] kernel BUG at lib/sg_pool.c:103! [ 694.705932] invalid opcode: [#1] PREEMPT SMP KASAN [ 694.708297] CPU: 2 PID: 349 Comm: kworker/2:1H Tainted: GB 5.0.0-rc6-dbg+ #2 [ 694.711730] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 694.715113] Workqueue: kblockd blk_mq_run_work_fn [ 694.716894] RIP: 0010:sg_alloc_table_chained+0xe5/0xf0 [ 694.758222] Call Trace: [ 694.759645] nvme_rdma_queue_rq+0x2aa/0xcc0 [nvme_rdma] [ 694.764915] blk_mq_try_issue_directly+0x2a5/0x4b0 [ 694.771779] blk_insert_cloned_request+0x11e/0x1c0 [ 694.778417] dm_mq_queue_rq+0x3d1/0x770 [ 694.793400] blk_mq_dispatch_rq_list+0x5fc/0xb10 [ 694.798386] blk_mq_sched_dispatch_requests+0x2f7/0x300 [ 694.803180] __blk_mq_run_hw_queue+0xd6/0x180 [ 694.808933] blk_mq_run_work_fn+0x27/0x30 [ 694.810315] process_one_work+0x4f1/0xa40 [ 694.813178] worker_thread+0x67/0x5b0 [ 694.814487] kthread+0x1cf/0x1f0 [ 694.819134] ret_from_fork+0x24/0x30 The code in sg_pool.c that triggers the BUG() statement is as follows: int sg_alloc_table_chained(struct sg_table *table, int nents, struct scatterlist *first_chunk) { int ret; BUG_ON(!nents); [ ... ] Bart. I can reproduce this issue("kernel BUG at lib/sg_pool.c:103") without mp-bvec patches, so looks it isn't the fault of this patchset. Thanks Ming for your feedback. Jens, I don't see that issue with kernel v5.0-rc6. Does that mean that the sg_pool BUG() is a regression in your for-next branch that predates Ming's multi-page bvec patch series? Thanks, Bart.
Re: [dm-devel] [PATCH V15 00/18] block: support multi-page bvec
On Sun, 2019-02-17 at 21:11 +0800, Ming Lei wrote: > The following patch should fix this issue: > > > diff --git a/block/blk-merge.c b/block/blk-merge.c > index bed065904677..066b66430523 100644 > --- a/block/blk-merge.c > +++ b/block/blk-merge.c > @@ -363,13 +363,15 @@ static unsigned int __blk_recalc_rq_segments(struct > request_queue *q, > struct bio_vec bv, bvprv = { NULL }; > int prev = 0; > unsigned int seg_size, nr_phys_segs; > - unsigned front_seg_size = bio->bi_seg_front_size; > + unsigned front_seg_size; > struct bio *fbio, *bbio; > struct bvec_iter iter; > > if (!bio) > return 0; > > + front_seg_size = bio->bi_seg_front_size; > + > switch (bio_op(bio)) { > case REQ_OP_DISCARD: > case REQ_OP_SECURE_ERASE: Hi Ming, With this patch applied test nvmeof-mp/002 fails as follows: [ 694.700400] kernel BUG at lib/sg_pool.c:103! [ 694.705932] invalid opcode: [#1] PREEMPT SMP KASAN [ 694.708297] CPU: 2 PID: 349 Comm: kworker/2:1H Tainted: GB 5.0.0-rc6-dbg+ #2 [ 694.711730] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 694.715113] Workqueue: kblockd blk_mq_run_work_fn [ 694.716894] RIP: 0010:sg_alloc_table_chained+0xe5/0xf0 [ 694.758222] Call Trace: [ 694.759645] nvme_rdma_queue_rq+0x2aa/0xcc0 [nvme_rdma] [ 694.764915] blk_mq_try_issue_directly+0x2a5/0x4b0 [ 694.771779] blk_insert_cloned_request+0x11e/0x1c0 [ 694.778417] dm_mq_queue_rq+0x3d1/0x770 [ 694.793400] blk_mq_dispatch_rq_list+0x5fc/0xb10 [ 694.798386] blk_mq_sched_dispatch_requests+0x2f7/0x300 [ 694.803180] __blk_mq_run_hw_queue+0xd6/0x180 [ 694.808933] blk_mq_run_work_fn+0x27/0x30 [ 694.810315] process_one_work+0x4f1/0xa40 [ 694.813178] worker_thread+0x67/0x5b0 [ 694.814487] kthread+0x1cf/0x1f0 [ 694.819134] ret_from_fork+0x24/0x30 The code in sg_pool.c that triggers the BUG() statement is as follows: int sg_alloc_table_chained(struct sg_table *table, int nents, struct scatterlist *first_chunk) { int ret; BUG_ON(!nents); [ ... ] Bart.
Re: [dm-devel] [PATCH V15 00/18] block: support multi-page bvec
On Fri, 2019-02-15 at 08:49 -0700, Jens Axboe wrote: > On 2/15/19 4:13 AM, Ming Lei wrote: > > This patchset brings multi-page bvec into block layer: > > Applied, thanks Ming. Let's hope it sticks! Hi Jens and Ming, Test nvmeof-mp/002 fails with Jens' for-next branch from this morning. I have not yet tried to figure out which patch introduced the failure. Anyway, this is what I see in the kernel log for test nvmeof-mp/002: [ 475.611363] BUG: unable to handle kernel NULL pointer dereference at 0020 [ 475.621188] #PF error: [normal kernel read fault] [ 475.623148] PGD 0 P4D 0 [ 475.624737] Oops: [#1] PREEMPT SMP KASAN [ 475.626628] CPU: 1 PID: 277 Comm: kworker/1:1H Tainted: GB 5.0.0-rc6-dbg+ #1 [ 475.630232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 475.633855] Workqueue: kblockd blk_mq_requeue_work [ 475.635777] RIP: 0010:__blk_recalc_rq_segments+0xbe/0x590 [ 475.670948] Call Trace: [ 475.693515] blk_recalc_rq_segments+0x2f/0x50 [ 475.695081] blk_insert_cloned_request+0xbb/0x1c0 [ 475.701142] dm_mq_queue_rq+0x3d1/0x770 [ 475.707225] blk_mq_dispatch_rq_list+0x5fc/0xb10 [ 475.717137] blk_mq_sched_dispatch_requests+0x256/0x300 [ 475.721767] __blk_mq_run_hw_queue+0xd6/0x180 [ 475.725920] __blk_mq_delay_run_hw_queue+0x25c/0x290 [ 475.727480] blk_mq_run_hw_queue+0x119/0x1b0 [ 475.732019] blk_mq_run_hw_queues+0x7b/0xa0 [ 475.733468] blk_mq_requeue_work+0x2cb/0x300 [ 475.736473] process_one_work+0x4f1/0xa40 [ 475.739424] worker_thread+0x67/0x5b0 [ 475.741751] kthread+0x1cf/0x1f0 [ 475.746034] ret_from_fork+0x24/0x30 (gdb) list *(__blk_recalc_rq_segments+0xbe) 0x816a152e is in __blk_recalc_rq_segments (block/blk-merge.c:366). 361 struct bio *bio) 362 { 363 struct bio_vec bv, bvprv = { NULL }; 364 int prev = 0; 365 unsigned int seg_size, nr_phys_segs; 366 unsigned front_seg_size = bio->bi_seg_front_size; 367 struct bio *fbio, *bbio; 368 struct bvec_iter iter; 369 370 if (!bio) Bart.
Re: [PATCH] btrfs: Fix a C compliance issue
On Wed, 2018-06-20 at 13:19 -0400, Jeff Mahoney wrote: > The shed should be yellow. > > -Jeff > > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c > index 891cd2ed5dd4..57c9da0b459f 100644 > --- a/fs/btrfs/super.c > +++ b/fs/btrfs/super.c > @@ -2375,21 +2375,20 @@ static __cold void btrfs_interface_exit(void) > > static void __init btrfs_print_mod_info(void) > { > - pr_info("Btrfs loaded, crc32c=%s" > + pr_info("Btrfs loaded, crc32c=%s", crc32c_impl()); > #ifdef CONFIG_BTRFS_DEBUG > - ", debug=on" > + pr_cont(", debug=on"); > #endif > #ifdef CONFIG_BTRFS_ASSERT > - ", assert=on" > + pr_cont(", assert=on"); > #endif > #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY > - ", integrity-checker=on" > + pr_cont(", integrity-checker=on"); > #endif > #ifdef CONFIG_BTRFS_FS_REF_VERIFY > - ", ref-verify=on" > + pr_cont(", ref-verify=on") > #endif > - "\n", > - crc32c_impl()); > + pr_cont("\n"); > } > > static int null_open(struct block_device *bdev, fmode_t mode) Since we are doing bikeshedding, let me contribute to it :-) From scripts/checkpatch.pl: if ($line =~ /\bprintk\s*\(\s*KERN_CONT\b|\bpr_cont\s*\(/) { WARN("LOGGING_CONTINUATION", "Avoid logging continuation uses where feasible\n" . $herecurr); } Bart.
[PATCH v2 0/3] Three patches that address static analyzer reports
Hello Chris and Josef, The three patches in this series address complaints reported by static analyzers (gcc + W=1, sparse, smatch). These patches do not change any functionality. Please consider these for inclusion in the upstream kernel. Thanks, Bart. Bart Van Assche (3): btrfs: Fix indentation btrfs: Annotate fall-through btrfs: Fix a C compliance issue fs/btrfs/extent-tree.c | 4 ++-- fs/btrfs/ioctl.c | 4 ++-- fs/btrfs/reada.c | 2 +- fs/btrfs/super.c | 7 --- 4 files changed, 9 insertions(+), 8 deletions(-) -- 2.17.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/3] btrfs: Annotate fall-through
This patch avoids that the compiler complains that a fall-through annotation is missing when building with W=1. Signed-off-by: Bart Van Assche --- fs/btrfs/super.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 81107ad49f3a..3e298f26a383 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -760,6 +760,7 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options, case Opt_recovery: btrfs_warn(info, "'recovery' is deprecated, use 'usebackuproot' instead"); + /* fall through */ case Opt_usebackuproot: btrfs_info(info, "trying to use backup root at mount time"); -- 2.17.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/3] btrfs: Fix indentation
This patch avoids that building the BTRFS source code with smatch triggers complaints about inconsistent indenting. Signed-off-by: Bart Van Assche --- fs/btrfs/extent-tree.c | 4 ++-- fs/btrfs/ioctl.c | 4 ++-- fs/btrfs/reada.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 3d9fe58c0080..db46ceb62b3f 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -6279,7 +6279,7 @@ static int update_block_group(struct btrfs_trans_handle *trans, if (list_empty(&cache->dirty_list)) { list_add_tail(&cache->dirty_list, &trans->transaction->dirty_bgs); - trans->transaction->num_dirty_bgs++; + trans->transaction->num_dirty_bgs++; btrfs_get_block_group(cache); } spin_unlock(&trans->transaction->dirty_bgs_lock); @@ -7534,7 +7534,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info, * for the proper type. */ if (!block_group_bits(block_group, flags)) { - u64 extra = BTRFS_BLOCK_GROUP_DUP | + u64 extra = BTRFS_BLOCK_GROUP_DUP | BTRFS_BLOCK_GROUP_RAID1 | BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6 | diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index c2837a32d689..93b23549ee1e 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -2507,8 +2507,8 @@ static int btrfs_search_path_in_tree_user(struct inode *inode, static noinline int btrfs_ioctl_ino_lookup(struct file *file, void __user *argp) { -struct btrfs_ioctl_ino_lookup_args *args; -struct inode *inode; + struct btrfs_ioctl_ino_lookup_args *args; + struct inode *inode; int ret = 0; args = memdup_user(argp, sizeof(*args)); diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c index 40f1bcef394d..4be425f70c2d 100644 --- a/fs/btrfs/reada.c +++ b/fs/btrfs/reada.c @@ -355,7 +355,7 @@ static struct reada_extent *reada_find_extent(struct btrfs_fs_info *fs_info, dev = bbio->stripes[nzones].dev; /* cannot read ahead on missing device. */ -if (!dev->bdev) + if (!dev->bdev) continue; zone = reada_find_zone(dev, logical, bbio); -- 2.17.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/3] btrfs: Fix a C compliance issue
The C programming language does not allow to use preprocessor statements inside macro arguments (pr_info() is defined as a macro). Hence rework the pr_info() statement in btrfs_print_mod_info() such that it becomes compliant. This patch allows tools like sparse to analyze the BTRFS source code. Fixes: 62e855771dac ("btrfs: convert printk(KERN_* to use pr_* calls") Signed-off-by: Bart Van Assche Cc: Jeff Mahoney Cc: David Sterba Cc: Nikolay Borisov --- fs/btrfs/super.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 3e298f26a383..972d9fbd7e96 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2370,7 +2370,7 @@ static __cold void btrfs_interface_exit(void) static void __init btrfs_print_mod_info(void) { - pr_info("Btrfs loaded, crc32c=%s" + static const char options[] = #ifdef CONFIG_BTRFS_DEBUG ", debug=on" #endif @@ -2383,8 +2383,8 @@ static void __init btrfs_print_mod_info(void) #ifdef CONFIG_BTRFS_FS_REF_VERIFY ", ref-verify=on" #endif - "\n", - crc32c_impl()); + ; + pr_info("Btrfs loaded, crc32c=%s%s\n", crc32c_impl(), options); } static int __init init_btrfs_fs(void) -- 2.17.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs: Fix a C compliance issue
On Mon, 2018-06-18 at 12:31 +0300, Nikolay Borisov wrote: > On 18.06.2018 12:26, David Sterba wrote: > > On Sat, Jun 16, 2018 at 01:28:13PM +0300, Nikolay Borisov wrote: > > > I'd rather not see more printk being added. Nothing prevents from having > > > the fmt string being passed to pr_info. > > > > So you mean to do > > > > + static const char fmt[] = "Btrfs loaded, crc32c=%s" > > + pr_info(fmt); > > Pretty much, something along the lines of > > pr_info(fmt, crc32c_impl). > > printk requires having the KERN_INFO in the format string, which I see > no point in doing, correct me if I'm wrong? You should know that what you proposed doesn't compile because pr_info() relies on string concatenation and hence requires that its first argument is a string constant instead of a const char pointer. Anyway, I will rework this patch such that it uses pr_info() instead of printk(). Bart.
[PATCH] btrfs: Fix a C compliance issue
The C programming language does not allow to use preprocessor statements inside macro arguments (pr_info() is defined as a macro). Hence rework the pr_info() statement in btrfs_print_mod_info() such that it becomes compliant. This patch allows tools like sparse to analyze the BTRFS source code. Fixes: 62e855771dac ("btrfs: convert printk(KERN_* to use pr_* calls") Signed-off-by: Bart Van Assche Cc: Jeff Mahoney Cc: David Sterba --- fs/btrfs/super.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 81107ad49f3a..dd4980df5b8e 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2369,7 +2369,7 @@ static __cold void btrfs_interface_exit(void) static void __init btrfs_print_mod_info(void) { - pr_info("Btrfs loaded, crc32c=%s" + static const char fmt[] = KERN_INFO "Btrfs loaded, crc32c=%s" #ifdef CONFIG_BTRFS_DEBUG ", debug=on" #endif @@ -2382,8 +2382,8 @@ static void __init btrfs_print_mod_info(void) #ifdef CONFIG_BTRFS_FS_REF_VERIFY ", ref-verify=on" #endif - "\n", - crc32c_impl()); + "\n"; + printk(fmt, crc32c_impl()); } static int __init init_btrfs_fs(void) -- 2.17.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHSET v5] blk-mq: reimplement timeout handling
On Fri, 2018-01-12 at 14:07 -0700, Jens Axboe wrote: > You're really not making it easy for folks to run this :-) My hope is that the ib_srp and ib_srpt patches will be accepted upstream soon. As long as these are not upstream, anyone who wants to retrieve these patches is welcome to clone https://github.com/bvanassche/linux/tree/block-scsi-for-next, a kernel tree with all my pending patches. > Do you have the matching blk-mq debugfs output for the device? Sorry but I only retrieved the blk-mq debugfs several minutes after the hang started so I'm not sure the state information is relevant. Anyway, I have attached it to this e-mail. The most remarkable part is the following: ./9ddfa913/requeue_list:9646711c {.op=READ, .state=idle, gen=0x1 18, abort_gen=0x0, .cmd_flags=, .rq_flags=SORTED|1|SOFTBARRIER|IO_STAT, complete =0, .tag=-1, .internal_tag=217} The hexadecimal number at the start is the request_queue pointer (I modified the blk-mq-debugfs code such that queues are registered with there address just after creation and until a name is assigned). This is a dm-mpath queue. Bart../9ddfa913/hctx0/cpu3/completed:29 0 ./9ddfa913/hctx0/cpu3/merged:0 ./9ddfa913/hctx0/cpu3/dispatched:30 0 ./9ddfa913/hctx0/cpu2/completed:0 0 ./9ddfa913/hctx0/cpu2/merged:0 ./9ddfa913/hctx0/cpu2/dispatched:0 0 ./9ddfa913/hctx0/cpu1/completed:0 0 ./9ddfa913/hctx0/cpu1/merged:0 ./9ddfa913/hctx0/cpu1/dispatched:0 0 ./9ddfa913/hctx0/cpu0/completed:0 0 ./9ddfa913/hctx0/cpu0/merged:0 ./9ddfa913/hctx0/cpu0/dispatched:0 0 ./9ddfa913/hctx0/active:0 ./9ddfa913/hctx0/run:92 ./9ddfa913/hctx0/queued:30 ./9ddfa913/hctx0/dispatched: 00 ./9ddfa913/hctx0/dispatched: 197 ./9ddfa913/hctx0/dispatched: 20 ./9ddfa913/hctx0/dispatched: 40 ./9ddfa913/hctx0/dispatched: 80 ./9ddfa913/hctx0/dispatched: 160 ./9ddfa913/hctx0/dispatched: 32+ 0 ./9ddfa913/hctx0/io_poll:considered=0 ./9ddfa913/hctx0/io_poll:invoked=0 ./9ddfa913/hctx0/io_poll:success=0 ./9ddfa913/hctx0/sched_tags_bitmap:: ./9ddfa913/hctx0/sched_tags_bitmap:0010: 0002 ./9ddfa913/hctx0/sched_tags:nr_tags=256 ./9ddfa913/hctx0/sched_tags:nr_reserved_tags=0 ./9ddfa913/hctx0/sched_tags:active_queues=0 ./9ddfa913/hctx0/sched_tags: ./9ddfa913/hctx0/sched_tags:bitmap_tags: ./9ddfa913/hctx0/sched_tags:depth=256 ./9ddfa913/hctx0/sched_tags:busy=1 ./9ddfa913/hctx0/sched_tags:bits_per_word=64 ./9ddfa913/hctx0/sched_tags:map_nr=4 ./9ddfa913/hctx0/sched_tags:alloc_hint={45, 56, 144, 218} ./9ddfa913/hctx0/sched_tags:wake_batch=8 ./9ddfa913/hctx0/sched_tags:wake_index=0 ./9ddfa913/hctx0/sched_tags:ws={ ./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive}, ./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive}, ./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive}, ./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive}, ./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive}, ./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive}, ./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive}, ./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive}, ./9ddfa913/hctx0/sched_tags:} ./9ddfa913/hctx0/sched_tags:round_robin=0 ./9ddfa913/hctx0/tags_bitmap:: ./9ddfa913/hctx0/tags_bitmap:0010: ./9ddfa913/hctx0/tags_bitmap:0020: ./9ddfa913/hctx0/tags_bitmap:0030: ./9ddfa913/hctx0/tags_bitmap:0040: ./9ddfa913/hctx0/tags_bitmap:0050: ./9ddfa913/hctx0/tags_bitmap:0060: ./9ddfa913/hctx0/tags_bitmap:0070: ./9ddfa913/hctx0/tags_bitmap:0080: ./9ddfa913/hctx0/tags_bitmap:0090: ./9ddfa913/hctx0/tags_bitmap:00a0: ./9ddfa913/hctx0/tags_bitmap:00b0: ./9ddfa913/hctx0/tags_bitmap:00c0: ./9ddfa913/hctx0/tags_bitmap:00d0: ./9ddfa913/hctx0/tags_bitmap:00e0:
Re: [PATCHSET v5] blk-mq: reimplement timeout handling
On Tue, 2018-01-09 at 08:29 -0800, Tejun Heo wrote: > Currently, blk-mq timeout path synchronizes against the usual > issue/completion path using a complex scheme involving atomic > bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence > rules. Unfortunatley, it contains quite a few holes. Hello Tejun, With this patch series applied I see weird hangs in blk_mq_get_tag() when I run the srp-test software. If I pull Jens' latest for-next branch and revert this patch series then the srp-test software runs successfully. Note: if you don't have InfiniBand hardware available then you will need the RDMA/CM patches for the SRP initiator and target drivers that have been posted recently on the linux-rdma mailing list to run the srp-test software. This is how I run the srp-test software in a VM: ./run_tests -c -d -r 10 Here is an example of what SysRq-w reported when the hang occurred: sysrq: SysRq : Show Blocked State taskPC stack pid father kworker/u8:0D12864 5 2 0x8000 Workqueue: events_unbound sd_probe_async [sd_mod] Call Trace: ? __schedule+0x2b4/0xbb0 schedule+0x2d/0x90 io_schedule+0xd/0x30 blk_mq_get_tag+0x169/0x290 ? finish_wait+0x80/0x80 blk_mq_get_request+0x16a/0x4f0 blk_mq_alloc_request+0x59/0xc0 blk_get_request_flags+0x3f/0x260 scsi_execute+0x33/0x1e0 [scsi_mod] read_capacity_16.part.35+0x9c/0x460 [sd_mod] sd_revalidate_disk+0x14bb/0x1cb0 [sd_mod] sd_probe_async+0xf2/0x1a0 [sd_mod] process_one_work+0x21c/0x6d0 worker_thread+0x35/0x380 ? process_one_work+0x6d0/0x6d0 kthread+0x117/0x130 ? kthread_create_worker_on_cpu+0x40/0x40 ret_from_fork+0x24/0x30 systemd-udevd D13672 1048285 0x0100 Call Trace: ? __schedule+0x2b4/0xbb0 schedule+0x2d/0x90 io_schedule+0xd/0x30 generic_file_read_iter+0x32f/0x970 ? page_cache_tree_insert+0x100/0x100 __vfs_read+0xcc/0x120 vfs_read+0x96/0x140 SyS_read+0x40/0xa0 do_syscall_64+0x5f/0x1b0 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7f8ce6d08d11 RSP: 002b:7fff96dec288 EFLAGS: 0246 ORIG_RAX: RAX: ffda RBX: 5651de7f6e10 RCX: 7f8ce6d08d11 RDX: 0040 RSI: 5651de7f6e38 RDI: 0007 RBP: 5651de7ea500 R08: 7f8ce6cf1c20 R09: 5651de7f6e10 R10: 006f R11: 0246 R12: 01ff R13: 01ff0040 R14: 5651de7ea550 R15: 0040 systemd-udevd D13496 1049285 0x0100 Call Trace: ? __schedule+0x2b4/0xbb0 schedule+0x2d/0x90 io_schedule+0xd/0x30 blk_mq_get_tag+0x169/0x290 ? finish_wait+0x80/0x80 blk_mq_get_request+0x16a/0x4f0 blk_mq_make_request+0x105/0x8e0 ? generic_make_request+0xd6/0x3d0 generic_make_request+0x103/0x3d0 ? submit_bio+0x57/0x110 submit_bio+0x57/0x110 mpage_readpages+0x13b/0x160 ? I_BDEV+0x10/0x10 ? rcu_read_lock_sched_held+0x66/0x70 ? __alloc_pages_nodemask+0x2e8/0x360 __do_page_cache_readahead+0x2a4/0x370 ? force_page_cache_readahead+0xaf/0x110 force_page_cache_readahead+0xaf/0x110 generic_file_read_iter+0x743/0x970 ? find_held_lock+0x2d/0x90 ? _raw_spin_unlock+0x29/0x40 __vfs_read+0xcc/0x120 vfs_read+0x96/0x140 SyS_read+0x40/0xa0 do_syscall_64+0x5f/0x1b0 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7f8ce6d08d11 RSP: 002b:7fff96dec8b8 EFLAGS: 0246 ORIG_RAX: RAX: ffda RBX: 7f8ce7085010 RCX: 7f8ce6d08d11 RDX: 0004 RSI: 7f8ce7085038 RDI: 000f RBP: 5651de7ec840 R08: R09: 7f8ce7085010 R10: 7f8ce7085028 R11: 0246 R12: R13: 0004 R14: 5651de7ec890 R15: 0004 systemd-udevd D13672 1055285 0x0100 Call Trace: ? __schedule+0x2b4/0xbb0 schedule+0x2d/0x90 io_schedule+0xd/0x30 blk_mq_get_tag+0x169/0x290 ? finish_wait+0x80/0x80 blk_mq_get_request+0x16a/0x4f0 blk_mq_make_request+0x105/0x8e0 ? generic_make_request+0xd6/0x3d0 generic_make_request+0x103/0x3d0 ? submit_bio+0x57/0x110 submit_bio+0x57/0x110 mpage_readpages+0x13b/0x160 ? I_BDEV+0x10/0x10 ? rcu_read_lock_sched_held+0x66/0x70 ? __alloc_pages_nodemask+0x2e8/0x360 __do_page_cache_readahead+0x2a4/0x370 ? force_page_cache_readahead+0xaf/0x110 force_page_cache_readahead+0xaf/0x110 generic_file_read_iter+0x743/0x970 __vfs_read+0xcc/0x120 vfs_read+0x96/0x140 SyS_read+0x40/0xa0 do_syscall_64+0x5f/0x1b0 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7f8ce6d08d11 RSP: 002b:7fff96dec848 EFLAGS: 0246 ORIG_RAX: RAX: ffda RBX: 5651de7ec300 RCX: 7f8ce6d08d11 RDX: 0100 RSI: 5651de7ec328 RDI: 000f RBP: 5651de7ea500 R08: R09: 5651de7ec300 R10: 5651de7ec318 R11: 0246 R12: 01ffe000 R13: 01ffe100 R14: 5651de7ea550 R15: 0100 Please let me know if you need more information. Bart.N�r��yb�X��ǧv�^�){.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 2/8] blk-mq: protect completion path with RCU
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > Currently, blk-mq protects only the issue path with RCU. This patch > puts the completion path under the same RCU protection. This will be > used to synchronize issue/completion against timeout by later patches, > which will also add the comments. > > Signed-off-by: Tejun Heo > --- > block/blk-mq.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index ddc9261..6741c3e 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -584,11 +584,16 @@ static void hctx_lock(struct blk_mq_hw_ctx *hctx, int > *srcu_idx) > void blk_mq_complete_request(struct request *rq) > { > struct request_queue *q = rq->q; > + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu); > + int srcu_idx; > > if (unlikely(blk_should_fake_timeout(q))) > return; > + > + hctx_lock(hctx, &srcu_idx); > if (!blk_mark_rq_complete(rq)) > __blk_mq_complete_request(rq); > + hctx_unlock(hctx, srcu_idx); > } > EXPORT_SYMBOL(blk_mq_complete_request); Hello Tejun, I'm concerned about the additional CPU cycles needed for the new blk_mq_map_queue() call, although I know this call is cheap. Would the timeout code really get that much more complicated if the hctx_lock() and hctx_unlock() calls would be changed into rcu_read_lock() and rcu_read_unlock() calls? Would it be sufficient if "if (has_rcu) synchronize_rcu();" would be changed into "synchronize_rcu();" in blk_mq_timeout_work()? Thanks, Bart.N�r��yb�X��ǧv�^�){.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 8/8] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > The RCU protection has been expanded to cover both queueing and > completion paths making ->queue_rq_srcu a misnomer. Rename it to > ->srcu as suggested by Bart. Reviewed-by: Bart Van Assche
Re: [PATCH 7/8] blk-mq: remove REQ_ATOM_STARTED
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > After the recent updates to use generation number and state based > synchronization, we can easily replace REQ_ATOM_STARTED usages by > adding an extra state to distinguish completed but not yet freed > state. > > Add MQ_RQ_COMPLETE and replace REQ_ATOM_STARTED usages with > blk_mq_rq_state() tests. REQ_ATOM_STARTED no longer has any users > left and is removed. Reviewed-by: Bart Van Assche N�r��yb�X��ǧv�^�){.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 6/8] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > After the recent updates to use generation number and state based > synchronization, blk-mq no longer depends on REQ_ATOM_COMPLETE except > to avoid firing the same timeout multiple times. > > Remove all REQ_ATOM_COMPLETE usages and use a new rq_flags flag > RQF_MQ_TIMEOUT_EXPIRED to avoid firing the same timeout multiple > times. This removes atomic bitops from hot paths too. > > v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out(). > > v3: Added RQF_MQ_TIMEOUT_EXPIRED flag. I think it's unfortunate that this patch spreads the request state over two struct request members, namely the lower two bits of gstate and the RQF_MQ_TIMEOUT_EXPIRED flag in req->rq_flags. Please consider to drop the RQF_MQ_TIMEOUT_EXPIRED flag and to represent the "timeout has been processed" state as a MQ_RQ_* state in gstate. That will avoid that every state update has to be reviewed whether or not perhaps an update of the RQF_MQ_TIMEOUT_EXPIRED flag is missing. Thanks, Bart.N�r��yb�X��ǧv�^�){.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > @@ -230,6 +232,27 @@ struct request { > > unsigned short write_hint; > > + /* > + * On blk-mq, the lower bits of ->gstate carry the MQ_RQ_* state > + * value and the upper bits the generation number which is > + * monotonically incremented and used to distinguish the reuse > + * instances. > + * > + * ->gstate_seq allows updates to ->gstate and other fields > + * (currently ->deadline) during request start to be read > + * atomically from the timeout path, so that it can operate on a > + * coherent set of information. > + */ > + seqcount_t gstate_seq; > + u64 gstate; > + > + /* > + * ->aborted_gstate is used by the timeout to claim a specific > + * recycle instance of this request. See blk_mq_timeout_work(). > + */ > + struct u64_stats_sync aborted_gstate_sync; > + u64 aborted_gstate; > + > unsigned long deadline; > struct list_head timeout_list; Does "gstate" perhaps stand for "generation number and state"? If so, please mention this in one of the above comments. Thanks, Bart.N�r��yb�X��ǧv�^�){.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > @@ -156,12 +156,12 @@ void blk_timeout_work(struct work_struct *work) > */ > void blk_abort_request(struct request *req) > { > - if (blk_mark_rq_complete(req)) > - return; > - > if (req->q->mq_ops) { > - blk_mq_rq_timed_out(req, false); > + req->deadline = jiffies; > + mod_timer(&req->q->timeout, 0); > } else { > + if (blk_mark_rq_complete(req)) > + return; > blk_delete_timer(req); > blk_rq_timed_out(req); > } Other req->deadline writes are protected by preempt_disable(), write_seqcount_begin(&rq->gstate_seq), write_seqcount_end(&rq->gstate_seq) and preempt_enable(). I think it's fine that the above req->deadline store does not have that protection but I also think that that deserves a comment. Thanks, Bart.
Re: [PATCH 4/8] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > blk_mq_check_inflight() and blk_mq_poll_hybrid_sleep() test > REQ_ATOM_COMPLETE to determine the request state. Both uses are > speculative and we can test REQ_ATOM_STARTED and blk_mq_rq_state() for > equivalent results. Replace the tests. This will allow removing > REQ_ATOM_COMPLETE usages from blk-mq. Reviewed-by: Bart Van Assche
Re: [PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > +static void blk_mq_rq_update_aborted_gstate(struct request *rq, u64 gstate) > +{ > + unsigned long flags; > + > + local_irq_save(flags); > + u64_stats_update_begin(&rq->aborted_gstate_sync); > + rq->aborted_gstate = gstate; > + u64_stats_update_end(&rq->aborted_gstate_sync); > + local_irq_restore(flags); > +} Please add a comment that explains the purpose of local_irq_save() and local_irq_restore(). Please also explain why you chose to disable interrupts instead of disabling preemption. I think that disabling preemption should be sufficient since this is the only code that updates rq->aborted_gstate and since this function is always called from thread context. > @@ -801,6 +840,12 @@ void blk_mq_rq_timed_out(struct request *req, bool > reserved) > __blk_mq_complete_request(req); > break; > case BLK_EH_RESET_TIMER: > + /* > + * As nothing prevents from completion happening while > + * ->aborted_gstate is set, this may lead to ignored > + * completions and further spurious timeouts. > + */ > + blk_mq_rq_update_aborted_gstate(req, 0); > blk_add_timer(req); > blk_clear_rq_complete(req); > break; Is the race that the comment refers to addressed by one of the later patches? Thanks, Bart.N�r��yb�X��ǧv�^�){.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 1/8] blk-mq: move hctx lock/unlock into a helper
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote: > +static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx) > +{ > + if (!(hctx->flags & BLK_MQ_F_BLOCKING)) > + rcu_read_unlock(); > + else > + srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx); > +} > + > +static void hctx_lock(struct blk_mq_hw_ctx *hctx, int *srcu_idx) > +{ > + if (!(hctx->flags & BLK_MQ_F_BLOCKING)) > + rcu_read_lock(); > + else > + *srcu_idx = srcu_read_lock(hctx->queue_rq_srcu); > +} A minor comment: please consider to reorder these two functions such that the lock function appears first and the unlock function second. Anyway: Reviewed-by: Bart Van Assche
fs_reclaim() deadlock complaint
Hello, Since I started testing kernel v4.14-rc1 I see a deadlock complaint appearing in the kernel log during boot. Is this a known issue? $ cat /etc/fstab # /dev/mapper/vg-root / btrfs defaults,subvol=@ 0 1 /dev/mapper/vg-boot /boot ext4defaults0 2 /dev/mapper/vg-root /home btrfs defaults,subvol=@home 0 2 UUID=5217d83f-213e-4b42-b86e-20013325ba6c none swapsw 0 0 From the system log: systemd[1]: Started Session c5 of user root. kernel: kernel: kernel: WARNING: possible recursive locking detected kernel: 4.14.0-rc3-dbg+ #11 Not tainted kernel: kernel: dpkg/10251 is trying to acquire lock: kernel: (fs_reclaim){+.+.}, at: [] fs_reclaim_acquire.part.85+0x0/0x30 kernel: kernel: but task is already holding lock: kernel: (fs_reclaim){+.+.}, at: [] fs_reclaim_acquire.part.85+0x0/0x30 kernel: kernel: other info that might help us debug this: kernel: Possible unsafe locking scenario: kernel: kernel: CPU0 kernel: kernel: lock(fs_reclaim); kernel: lock(fs_reclaim); kernel: kernel: *** DEADLOCK *** kernel: kernel: May be due to missing lock nesting notation kernel: kernel: 2 locks held by dpkg/10251: kernel: #0: (&mm->mmap_sem){}, at: [] __do_page_fault+0x11f/0x460 kernel: #1: (fs_reclaim){+.+.}, at: [] fs_reclaim_acquire.part.85+0x0/0x30 kernel: kernel: stack backtrace: kernel: CPU: 1 PID: 10251 Comm: dpkg Not tainted 4.14.0-rc3-dbg+ #11 kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 kernel: Call Trace: kernel: dump_stack+0x86/0xcf kernel: __lock_acquire+0x720/0x1440 kernel: ? unwind_get_return_address+0x1a/0x30 kernel: ? __bfs+0x12e/0x210 kernel: lock_acquire+0xf5/0x200 kernel: ? lock_acquire+0xf5/0x200 kernel: ? pageset_set_high_and_batch+0x90/0x90 kernel: ? alloc_extent_state+0x1f/0x1a0 [btrfs] kernel: fs_reclaim_acquire.part.85+0x24/0x30 kernel: ? pageset_set_high_and_batch+0x90/0x90 kernel: fs_reclaim_acquire+0x14/0x20 kernel: kmem_cache_alloc+0x2a/0x2c0 kernel: ? lock_acquire+0xf5/0x200 kernel: alloc_extent_state+0x1f/0x1a0 [btrfs] kernel: __clear_extent_bit+0x26f/0x3f0 [btrfs] kernel: ? test_range_bit+0xd1/0x100 [btrfs] kernel: try_release_extent_mapping+0x192/0x1e0 [btrfs] kernel: __btrfs_releasepage+0x2f/0x70 [btrfs] kernel: ? page_get_anon_vma+0x170/0x170 kernel: btrfs_releasepage+0x3c/0x40 [btrfs] kernel: try_to_release_page+0x4e/0x60 kernel: shrink_page_list+0xbc8/0x1010 kernel: ? trace_hardirqs_on_caller+0xf4/0x190 kernel: shrink_inactive_list+0x1a9/0x520 kernel: shrink_node_memcg.constprop.80+0x4ae/0x750 kernel: ? rcu_read_lock_sched_held+0x45/0x80 kernel: shrink_node+0x45/0x1a0 kernel: ? shrink_node+0x45/0x1a0 kernel: do_try_to_free_pages+0xd1/0x2c0 kernel: try_to_free_pages+0xed/0x330 kernel: ? pageset_set_high_and_batch+0x90/0x90 kernel: __alloc_pages_slowpath+0x454/0x1220 kernel: __alloc_pages_nodemask+0x273/0x300 kernel: mm_get_huge_zero_page+0x7f/0xf0 kernel: do_huge_pmd_anonymous_page+0x301/0x500 kernel: __handle_mm_fault+0xb42/0xd70 kernel: handle_mm_fault+0x8d/0x100 kernel: __do_page_fault+0x23f/0x460 kernel: ? trace_hardirqs_on_caller+0xf4/0x190 kernel: do_page_fault+0x2e/0x280 kernel: do_async_page_fault+0x14/0x60 kernel: async_page_fault+0x22/0x30 kernel: RIP: 0033:0x55704b9c5263 kernel: RSP: 002b:7ffc8326c4a0 EFLAGS: 00010202 kernel: RAX: 55704c390f10 RBX: RCX: 7f8ec8985b00 kernel: RDX: 55704c390f10 RSI: 55704c390f40 RDI: kernel: RBP: 55704b9d2350 R08: 7f8ec8987760 R09: 0040 kernel: R10: 7f8ec8985b58 R11: 0202 R12: 55704b9ab410 kernel: R13: 7ffc8326c5f0 R14: R15: Thanks, Bart.
Re: [PATCH 06/15] fs: simplify dio_bio_complete
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote: > Only read bio->bi_error once in the common path. Reviewed-by: Bart Van Assche -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/15] fs: remove the unused error argument to dio_end_io()
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote: > Signed-off-by: Christoph Hellwig Reviewed-by: Bart Van Assche -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/15] dm: fix REQ_RAHEAD handling
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote: > A few (but not all) dm targets use a special EWOULDBLOCK error code for > failing REQ_RAHEAD requests that fail due to a lack of available resources. > But no one else knows about this magic code, and lower level drivers also > don't generate it when failing read-ahead requests for similar reasons. > > So remove this special casing and ignore all additional error handling for > REQ_RAHEAD - if this was a real underlying error we'd get a normal read > once the real read comes in. Reviewed-by: Bart Van Assche -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/15] gfs2: remove the unused sd_log_error field
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote: > Signed-off-by: Christoph Hellwig Reviewed-by: Bart Van Assche -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/15] scsi/osd: don't save block errors into req_results
On Thu, 2017-05-18 at 15:17 +0200, Christoph Hellwig wrote: > We will only have sense data if the command exectured and got a SCSI > result, so this is pointless. > > Signed-off-by: Christoph Hellwig > --- > drivers/scsi/osd/osd_initiator.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/osd/osd_initiator.c > b/drivers/scsi/osd/osd_initiator.c > index 8a1b94816419..14785177ce7b 100644 > --- a/drivers/scsi/osd/osd_initiator.c > +++ b/drivers/scsi/osd/osd_initiator.c > @@ -477,7 +477,7 @@ static void _set_error_resid(struct osd_request *or, > struct request *req, >int error) > { > or->async_error = error; > - or->req_errors = scsi_req(req)->result ? : error; > + or->req_errors = scsi_req(req)->result; > or->sense_len = scsi_req(req)->sense_len; > if (or->sense_len) > memcpy(or->sense, scsi_req(req)->sense, or->sense_len); Hello Christoph, Are you sure that that code is not necessary? From osd_initiator.c: static void _put_request(struct request *rq) { /* * If osd_finalize_request() was called but the request was not * executed through the block layer, then we must release BIOs. * TODO: Keep error code in or->async_error. Need to audit all * code paths. */ if (unlikely(rq->bio)) blk_end_request(rq, -ENOMEM, blk_rq_bytes(rq)); else blk_put_request(rq); } Bart.-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/15] nvme-lightnvm: use blk_execute_rq in nvme_nvm_submit_user_cmd
On Thu, 2017-05-18 at 15:17 +0200, Christoph Hellwig wrote: > Instead of reinventing it poorly. Reviewed-by: Bart Van Assche -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/15] block: merge blk_types.h into bio.h
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote: > We've cleaned up our headers sufficiently that we don't need this split > anymore. Hello Christoph, Request-based drivers need the structure definitions from and the type definitions from but do not need the definition of struct bio. Have you considered to remove #include from file include/linux/blkdev.h? Do you think that would help to reduce the kernel build time? Thanks, Bart.N�r��yb�X��ǧv�^�){.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥
Re: [RFC PATCH 0/2] Introduce blkdev_issue_flush_no_wait()
On Tue, 2017-05-16 at 17:39 +0800, Anand Jain wrote: > BTRFS wanted a block device flush function which does not wait for > its completion, so that the flush for the next device can be called > in the same thread. > > Here is a RFC patch to provide the function > 'blkdev_issue_flush_no_wait()', which is based on the current device > flush function 'blkdev_issue_flush()', however it uses submit_bio() > instead of submit_bio_wait(). > > This patch is for review comments, will send out a final patch based > on the comments received. Hello Anand, Since the block layer can reorder requests, I think using blkdev_issue_flush_no_wait() will only yield the intended result if the caller waits until the requests that have to be flushed have completed. Is that how you intend to use this function? Thanks, Bart.N�r��yb�X��ǧv�^�){.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥
[PATCH 2/3] block, dm-crypt, btrfs: Introduce bio_flags()
Introduce the bio_flags() macro. Ensure that the second argument of bio_set_op_attrs() only contains flags and no operation. This patch does not change any functionality. Signed-off-by: Bart Van Assche Cc: Mike Christie Cc: Chris Mason (maintainer:BTRFS FILE SYSTEM) Cc: Josef Bacik (maintainer:BTRFS FILE SYSTEM) Cc: Mike Snitzer Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Damien Le Moal --- drivers/md/dm-crypt.c | 2 +- fs/btrfs/inode.c | 5 +++-- include/linux/blk_types.h | 3 ++- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 8742957..0448e7e 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -1136,7 +1136,7 @@ static void clone_init(struct dm_crypt_io *io, struct bio *clone) clone->bi_private = io; clone->bi_end_io = crypt_endio; clone->bi_bdev= cc->dev->bdev; - bio_set_op_attrs(clone, bio_op(io->base_bio), io->base_bio->bi_opf); + bio_set_op_attrs(clone, bio_op(io->base_bio), bio_flags(io->base_bio)); } static int kcryptd_io_read(struct dm_crypt_io *io, gfp_t gfp) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e6811c4..ca01106 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8412,7 +8412,7 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip, if (!bio) return -ENOMEM; - bio_set_op_attrs(bio, bio_op(orig_bio), orig_bio->bi_opf); + bio_set_op_attrs(bio, bio_op(orig_bio), bio_flags(orig_bio)); bio->bi_private = dip; bio->bi_end_io = btrfs_end_dio_bio; btrfs_io_bio(bio)->logical = file_offset; @@ -8450,7 +8450,8 @@ next_block: start_sector, GFP_NOFS); if (!bio) goto out_err; - bio_set_op_attrs(bio, bio_op(orig_bio), orig_bio->bi_opf); + bio_set_op_attrs(bio, bio_op(orig_bio), +bio_flags(orig_bio)); bio->bi_private = dip; bio->bi_end_io = btrfs_end_dio_bio; btrfs_io_bio(bio)->logical = file_offset; diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 1e1ef21..311fa2f 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -90,11 +90,12 @@ struct bio { }; #define BIO_OP_SHIFT (8 * FIELD_SIZEOF(struct bio, bi_opf) - REQ_OP_BITS) +#define bio_flags(bio) ((bio)->bi_opf & ((1 << BIO_OP_SHIFT) - 1)) #define bio_op(bio)((bio)->bi_opf >> BIO_OP_SHIFT) #define bio_set_op_attrs(bio, op, op_flags) do { \ WARN_ON(op >= (1 << REQ_OP_BITS)); \ - (bio)->bi_opf &= ((1 << BIO_OP_SHIFT) - 1); \ + (bio)->bi_opf = bio_flags(bio); \ (bio)->bi_opf |= ((unsigned int) (op) << BIO_OP_SHIFT); \ (bio)->bi_opf |= op_flags; \ } while (0) -- 2.10.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] block: Document that bio_op() uses the data type of bio.bi_opf
Make it clear that the sizeof(unsigned int) expression in BIO_OP_SHIFT refers to the bi_opf member of struct bio. Signed-off-by: Bart Van Assche Cc: Mike Christie Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Damien Le Moal --- include/linux/blk_types.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 436f43f..1e1ef21 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -89,7 +89,7 @@ struct bio { struct bio_vec bi_inline_vecs[0]; }; -#define BIO_OP_SHIFT (8 * sizeof(unsigned int) - REQ_OP_BITS) +#define BIO_OP_SHIFT (8 * FIELD_SIZEOF(struct bio, bi_opf) - REQ_OP_BITS) #define bio_op(bio)((bio)->bi_opf >> BIO_OP_SHIFT) #define bio_set_op_attrs(bio, op, op_flags) do { \ -- 2.10.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] block: Improve bio_set_op_attrs() robustness
Since REQ_OP_BITS == 3 and __REQ_NR_BITS == 30 it is not that hard to pass an op_flags argument to bio_set_op_attrs() that is larger than the number of bits reserved for the op_flags argument. Complain if this happens. Additionally, ensure that negative arguments trigger a complaint (1 << ... is signed while 1U << ... is unsigned; adding 0U to an integer expression causes it to be promoted to an unsigned type). Signed-off-by: Bart Van Assche Cc: Mike Christie Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Damien Le Moal --- include/linux/blk_types.h | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 311fa2f..53ee1a2 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -93,11 +93,18 @@ struct bio { #define bio_flags(bio) ((bio)->bi_opf & ((1 << BIO_OP_SHIFT) - 1)) #define bio_op(bio)((bio)->bi_opf >> BIO_OP_SHIFT) -#define bio_set_op_attrs(bio, op, op_flags) do { \ - WARN_ON(op >= (1 << REQ_OP_BITS)); \ - (bio)->bi_opf = bio_flags(bio); \ - (bio)->bi_opf |= ((unsigned int) (op) << BIO_OP_SHIFT); \ - (bio)->bi_opf |= op_flags; \ +#define bio_set_op_attrs(bio, op, op_flags) do { \ + if (__builtin_constant_p(op)) \ + BUILD_BUG_ON((op) + 0U >= (1U << REQ_OP_BITS)); \ + else\ + WARN_ON_ONCE((op) + 0U >= (1U << REQ_OP_BITS)); \ + if (__builtin_constant_p(op_flags)) \ + BUILD_BUG_ON((op_flags) + 0U >= (1U << BIO_OP_SHIFT)); \ + else\ + WARN_ON_ONCE((op_flags) + 0U >= (1U << BIO_OP_SHIFT)); \ + (bio)->bi_opf = bio_flags(bio); \ + (bio)->bi_opf |= (((op) + 0U) << BIO_OP_SHIFT); \ + (bio)->bi_opf |= (op_flags);\ } while (0) #define BIO_RESET_BYTESoffsetof(struct bio, bi_max_vecs) -- 2.10.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] block: Improve bio_set_op_attrs() robustness
Hi Jens, bio_set_op_attrs() does not yet check whether the "op_flags" field overflows into the "op" field. Since adding support for SMR requires to introduce more REQ_* flags I think it is important to have such a check. Hence this patch series. Please consider these patches for inclusion in the upstream Linux kernel. Thanks, Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [dm-devel] [PATCH 01/35] block/fs/drivers: remove rw argument from submit_bio
On 01/05/2016 09:53 PM, mchri...@redhat.com wrote: From: Mike Christie This has callers of submit_bio/submit_bio_wait set the bio->bi_rw instead of passing it in. This makes that use the same as generic_make_request and how we set the other bio fields. Reviewed-by: Bart Van Assche -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html