Re: [dm-devel] [PATCH V15 00/18] block: support multi-page bvec

2019-02-19 Thread Bart Van Assche

On 2/19/19 5:17 PM, Ming Lei wrote:

On Tue, Feb 19, 2019 at 08:28:19AM -0800, Bart Van Assche wrote:

With this patch applied test nvmeof-mp/002 fails as follows:

[  694.700400] kernel BUG at lib/sg_pool.c:103!
[  694.705932] invalid opcode:  [#1] PREEMPT SMP KASAN
[  694.708297] CPU: 2 PID: 349 Comm: kworker/2:1H Tainted: GB 
5.0.0-rc6-dbg+ #2
[  694.711730] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[  694.715113] Workqueue: kblockd blk_mq_run_work_fn
[  694.716894] RIP: 0010:sg_alloc_table_chained+0xe5/0xf0
[  694.758222] Call Trace:
[  694.759645]  nvme_rdma_queue_rq+0x2aa/0xcc0 [nvme_rdma]
[  694.764915]  blk_mq_try_issue_directly+0x2a5/0x4b0
[  694.771779]  blk_insert_cloned_request+0x11e/0x1c0
[  694.778417]  dm_mq_queue_rq+0x3d1/0x770
[  694.793400]  blk_mq_dispatch_rq_list+0x5fc/0xb10
[  694.798386]  blk_mq_sched_dispatch_requests+0x2f7/0x300
[  694.803180]  __blk_mq_run_hw_queue+0xd6/0x180
[  694.808933]  blk_mq_run_work_fn+0x27/0x30
[  694.810315]  process_one_work+0x4f1/0xa40
[  694.813178]  worker_thread+0x67/0x5b0
[  694.814487]  kthread+0x1cf/0x1f0
[  694.819134]  ret_from_fork+0x24/0x30

The code in sg_pool.c that triggers the BUG() statement is as follows:

int sg_alloc_table_chained(struct sg_table *table, int nents,
struct scatterlist *first_chunk)
{
int ret;

BUG_ON(!nents);
[ ... ]

Bart.


I can reproduce this issue("kernel BUG at lib/sg_pool.c:103") without mp-bvec 
patches,
so looks it isn't the fault of this patchset.


Thanks Ming for your feedback.

Jens, I don't see that issue with kernel v5.0-rc6. Does that mean that 
the sg_pool BUG() is a regression in your for-next branch that predates 
Ming's multi-page bvec patch series?


Thanks,

Bart.


Re: [dm-devel] [PATCH V15 00/18] block: support multi-page bvec

2019-02-19 Thread Bart Van Assche
On Sun, 2019-02-17 at 21:11 +0800, Ming Lei wrote:
> The following patch should fix this issue:
> 
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index bed065904677..066b66430523 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -363,13 +363,15 @@ static unsigned int __blk_recalc_rq_segments(struct 
> request_queue *q,
>   struct bio_vec bv, bvprv = { NULL };
>   int prev = 0;
>   unsigned int seg_size, nr_phys_segs;
> - unsigned front_seg_size = bio->bi_seg_front_size;
> + unsigned front_seg_size;
>   struct bio *fbio, *bbio;
>   struct bvec_iter iter;
>  
>   if (!bio)
>   return 0;
>  
> + front_seg_size = bio->bi_seg_front_size;
> +
>   switch (bio_op(bio)) {
>   case REQ_OP_DISCARD:
>   case REQ_OP_SECURE_ERASE:

Hi Ming,

With this patch applied test nvmeof-mp/002 fails as follows:

[  694.700400] kernel BUG at lib/sg_pool.c:103!
[  694.705932] invalid opcode:  [#1] PREEMPT SMP KASAN
[  694.708297] CPU: 2 PID: 349 Comm: kworker/2:1H Tainted: GB 
5.0.0-rc6-dbg+ #2
[  694.711730] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[  694.715113] Workqueue: kblockd blk_mq_run_work_fn
[  694.716894] RIP: 0010:sg_alloc_table_chained+0xe5/0xf0
[  694.758222] Call Trace:
[  694.759645]  nvme_rdma_queue_rq+0x2aa/0xcc0 [nvme_rdma]
[  694.764915]  blk_mq_try_issue_directly+0x2a5/0x4b0
[  694.771779]  blk_insert_cloned_request+0x11e/0x1c0
[  694.778417]  dm_mq_queue_rq+0x3d1/0x770
[  694.793400]  blk_mq_dispatch_rq_list+0x5fc/0xb10
[  694.798386]  blk_mq_sched_dispatch_requests+0x2f7/0x300
[  694.803180]  __blk_mq_run_hw_queue+0xd6/0x180
[  694.808933]  blk_mq_run_work_fn+0x27/0x30
[  694.810315]  process_one_work+0x4f1/0xa40
[  694.813178]  worker_thread+0x67/0x5b0
[  694.814487]  kthread+0x1cf/0x1f0
[  694.819134]  ret_from_fork+0x24/0x30

The code in sg_pool.c that triggers the BUG() statement is as follows:

int sg_alloc_table_chained(struct sg_table *table, int nents,
struct scatterlist *first_chunk)
{
int ret;

BUG_ON(!nents);
[ ... ]

Bart.


Re: [dm-devel] [PATCH V15 00/18] block: support multi-page bvec

2019-02-15 Thread Bart Van Assche
On Fri, 2019-02-15 at 08:49 -0700, Jens Axboe wrote:
> On 2/15/19 4:13 AM, Ming Lei wrote:
> > This patchset brings multi-page bvec into block layer:
> 
> Applied, thanks Ming. Let's hope it sticks!

Hi Jens and Ming,

Test nvmeof-mp/002 fails with Jens' for-next branch from this morning.
I have not yet tried to figure out which patch introduced the failure.
Anyway, this is what I see in the kernel log for test nvmeof-mp/002:

[  475.611363] BUG: unable to handle kernel NULL pointer dereference at 
0020
[  475.621188] #PF error: [normal kernel read fault]
[  475.623148] PGD 0 P4D 0  
[  475.624737] Oops:  [#1] PREEMPT SMP KASAN
[  475.626628] CPU: 1 PID: 277 Comm: kworker/1:1H Tainted: GB 
5.0.0-rc6-dbg+ #1
[  475.630232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[  475.633855] Workqueue: kblockd blk_mq_requeue_work
[  475.635777] RIP: 0010:__blk_recalc_rq_segments+0xbe/0x590
[  475.670948] Call Trace:
[  475.693515]  blk_recalc_rq_segments+0x2f/0x50
[  475.695081]  blk_insert_cloned_request+0xbb/0x1c0
[  475.701142]  dm_mq_queue_rq+0x3d1/0x770
[  475.707225]  blk_mq_dispatch_rq_list+0x5fc/0xb10
[  475.717137]  blk_mq_sched_dispatch_requests+0x256/0x300
[  475.721767]  __blk_mq_run_hw_queue+0xd6/0x180
[  475.725920]  __blk_mq_delay_run_hw_queue+0x25c/0x290
[  475.727480]  blk_mq_run_hw_queue+0x119/0x1b0
[  475.732019]  blk_mq_run_hw_queues+0x7b/0xa0
[  475.733468]  blk_mq_requeue_work+0x2cb/0x300
[  475.736473]  process_one_work+0x4f1/0xa40
[  475.739424]  worker_thread+0x67/0x5b0
[  475.741751]  kthread+0x1cf/0x1f0
[  475.746034]  ret_from_fork+0x24/0x30

(gdb) list *(__blk_recalc_rq_segments+0xbe)
0x816a152e is in __blk_recalc_rq_segments (block/blk-merge.c:366).
361  struct bio *bio)
362 {
363 struct bio_vec bv, bvprv = { NULL };
364 int prev = 0;
365 unsigned int seg_size, nr_phys_segs;
366 unsigned front_seg_size = bio->bi_seg_front_size;
367 struct bio *fbio, *bbio;
368 struct bvec_iter iter;
369
370 if (!bio)

Bart.


Re: [PATCH] btrfs: Fix a C compliance issue

2018-06-20 Thread Bart Van Assche
On Wed, 2018-06-20 at 13:19 -0400, Jeff Mahoney wrote:
> The shed should be yellow.
> 
> -Jeff
> 
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 891cd2ed5dd4..57c9da0b459f 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -2375,21 +2375,20 @@ static __cold void btrfs_interface_exit(void)
> 
>  static void __init btrfs_print_mod_info(void)
>  {
> - pr_info("Btrfs loaded, crc32c=%s"
> + pr_info("Btrfs loaded, crc32c=%s", crc32c_impl());
>  #ifdef CONFIG_BTRFS_DEBUG
> - ", debug=on"
> + pr_cont(", debug=on");
>  #endif
>  #ifdef CONFIG_BTRFS_ASSERT
> - ", assert=on"
> + pr_cont(", assert=on");
>  #endif
>  #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
> - ", integrity-checker=on"
> + pr_cont(", integrity-checker=on");
>  #endif
>  #ifdef CONFIG_BTRFS_FS_REF_VERIFY
> - ", ref-verify=on"
> + pr_cont(", ref-verify=on")
>  #endif
> - "\n",
> - crc32c_impl());
> + pr_cont("\n");
>  }
> 
>  static int null_open(struct block_device *bdev, fmode_t mode)

Since we are doing bikeshedding, let me contribute to it :-)

From scripts/checkpatch.pl:

if ($line =~ /\bprintk\s*\(\s*KERN_CONT\b|\bpr_cont\s*\(/) {
WARN("LOGGING_CONTINUATION",
 "Avoid logging continuation uses where feasible\n" 
. $herecurr);
}

Bart.







[PATCH v2 0/3] Three patches that address static analyzer reports

2018-06-20 Thread Bart Van Assche
Hello Chris and Josef,

The three patches in this series address complaints reported by static
analyzers (gcc + W=1, sparse, smatch). These patches do not change any
functionality. Please consider these for inclusion in the upstream
kernel.

Thanks,

Bart.

Bart Van Assche (3):
  btrfs: Fix indentation
  btrfs: Annotate fall-through
  btrfs: Fix a C compliance issue

 fs/btrfs/extent-tree.c | 4 ++--
 fs/btrfs/ioctl.c   | 4 ++--
 fs/btrfs/reada.c   | 2 +-
 fs/btrfs/super.c   | 7 ---
 4 files changed, 9 insertions(+), 8 deletions(-)

-- 
2.17.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/3] btrfs: Annotate fall-through

2018-06-20 Thread Bart Van Assche
This patch avoids that the compiler complains that a fall-through
annotation is missing when building with W=1.

Signed-off-by: Bart Van Assche 
---
 fs/btrfs/super.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 81107ad49f3a..3e298f26a383 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -760,6 +760,7 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char 
*options,
case Opt_recovery:
btrfs_warn(info,
   "'recovery' is deprecated, use 
'usebackuproot' instead");
+   /* fall through */
case Opt_usebackuproot:
btrfs_info(info,
   "trying to use backup root at mount time");
-- 
2.17.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/3] btrfs: Fix indentation

2018-06-20 Thread Bart Van Assche
This patch avoids that building the BTRFS source code with smatch
triggers complaints about inconsistent indenting.

Signed-off-by: Bart Van Assche 
---
 fs/btrfs/extent-tree.c | 4 ++--
 fs/btrfs/ioctl.c   | 4 ++--
 fs/btrfs/reada.c   | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3d9fe58c0080..db46ceb62b3f 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6279,7 +6279,7 @@ static int update_block_group(struct btrfs_trans_handle 
*trans,
if (list_empty(&cache->dirty_list)) {
list_add_tail(&cache->dirty_list,
  &trans->transaction->dirty_bgs);
-   trans->transaction->num_dirty_bgs++;
+   trans->transaction->num_dirty_bgs++;
btrfs_get_block_group(cache);
}
spin_unlock(&trans->transaction->dirty_bgs_lock);
@@ -7534,7 +7534,7 @@ static noinline int find_free_extent(struct btrfs_fs_info 
*fs_info,
 * for the proper type.
 */
if (!block_group_bits(block_group, flags)) {
-   u64 extra = BTRFS_BLOCK_GROUP_DUP |
+   u64 extra = BTRFS_BLOCK_GROUP_DUP |
BTRFS_BLOCK_GROUP_RAID1 |
BTRFS_BLOCK_GROUP_RAID5 |
BTRFS_BLOCK_GROUP_RAID6 |
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index c2837a32d689..93b23549ee1e 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2507,8 +2507,8 @@ static int btrfs_search_path_in_tree_user(struct inode 
*inode,
 static noinline int btrfs_ioctl_ino_lookup(struct file *file,
   void __user *argp)
 {
-struct btrfs_ioctl_ino_lookup_args *args;
-struct inode *inode;
+   struct btrfs_ioctl_ino_lookup_args *args;
+   struct inode *inode;
int ret = 0;
 
args = memdup_user(argp, sizeof(*args));
diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c
index 40f1bcef394d..4be425f70c2d 100644
--- a/fs/btrfs/reada.c
+++ b/fs/btrfs/reada.c
@@ -355,7 +355,7 @@ static struct reada_extent *reada_find_extent(struct 
btrfs_fs_info *fs_info,
dev = bbio->stripes[nzones].dev;
 
/* cannot read ahead on missing device. */
-if (!dev->bdev)
+   if (!dev->bdev)
continue;
 
zone = reada_find_zone(dev, logical, bbio);
-- 
2.17.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/3] btrfs: Fix a C compliance issue

2018-06-20 Thread Bart Van Assche
The C programming language does not allow to use preprocessor statements
inside macro arguments (pr_info() is defined as a macro). Hence rework
the pr_info() statement in btrfs_print_mod_info() such that it becomes
compliant. This patch allows tools like sparse to analyze the BTRFS
source code.

Fixes: 62e855771dac ("btrfs: convert printk(KERN_* to use pr_* calls")
Signed-off-by: Bart Van Assche 
Cc: Jeff Mahoney 
Cc: David Sterba 
Cc: Nikolay Borisov 
---
 fs/btrfs/super.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 3e298f26a383..972d9fbd7e96 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2370,7 +2370,7 @@ static __cold void btrfs_interface_exit(void)
 
 static void __init btrfs_print_mod_info(void)
 {
-   pr_info("Btrfs loaded, crc32c=%s"
+   static const char options[] =
 #ifdef CONFIG_BTRFS_DEBUG
", debug=on"
 #endif
@@ -2383,8 +2383,8 @@ static void __init btrfs_print_mod_info(void)
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
", ref-verify=on"
 #endif
-   "\n",
-   crc32c_impl());
+   ;
+   pr_info("Btrfs loaded, crc32c=%s%s\n", crc32c_impl(), options);
 }
 
 static int __init init_btrfs_fs(void)
-- 
2.17.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: Fix a C compliance issue

2018-06-20 Thread Bart Van Assche
On Mon, 2018-06-18 at 12:31 +0300, Nikolay Borisov wrote:
> On 18.06.2018 12:26, David Sterba wrote:
> > On Sat, Jun 16, 2018 at 01:28:13PM +0300, Nikolay Borisov wrote:
> > > I'd rather not see more printk being added. Nothing prevents from having
> > > the fmt string being passed to pr_info.
> > 
> > So you mean to do
> > 
> > +   static const char fmt[] = "Btrfs loaded, crc32c=%s"
> > +   pr_info(fmt);
> 
> Pretty much, something along the lines of
> 
> pr_info(fmt, crc32c_impl).
> 
> printk requires having the KERN_INFO in the format string, which I see
> no point in doing, correct me if I'm wrong?

You should know that what you proposed doesn't compile because pr_info()
relies on string concatenation and hence requires that its first argument is
a string constant instead of a const char pointer. Anyway, I will rework this
patch such that it uses pr_info() instead of printk().

Bart.




[PATCH] btrfs: Fix a C compliance issue

2018-06-15 Thread Bart Van Assche
The C programming language does not allow to use preprocessor statements
inside macro arguments (pr_info() is defined as a macro). Hence rework
the pr_info() statement in btrfs_print_mod_info() such that it becomes
compliant. This patch allows tools like sparse to analyze the BTRFS
source code.

Fixes: 62e855771dac ("btrfs: convert printk(KERN_* to use pr_* calls")
Signed-off-by: Bart Van Assche 
Cc: Jeff Mahoney 
Cc: David Sterba 
---
 fs/btrfs/super.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 81107ad49f3a..dd4980df5b8e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2369,7 +2369,7 @@ static __cold void btrfs_interface_exit(void)
 
 static void __init btrfs_print_mod_info(void)
 {
-   pr_info("Btrfs loaded, crc32c=%s"
+   static const char fmt[] = KERN_INFO "Btrfs loaded, crc32c=%s"
 #ifdef CONFIG_BTRFS_DEBUG
", debug=on"
 #endif
@@ -2382,8 +2382,8 @@ static void __init btrfs_print_mod_info(void)
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
", ref-verify=on"
 #endif
-   "\n",
-   crc32c_impl());
+   "\n";
+   printk(fmt, crc32c_impl());
 }
 
 static int __init init_btrfs_fs(void)
-- 
2.17.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHSET v5] blk-mq: reimplement timeout handling

2018-01-12 Thread Bart Van Assche
On Fri, 2018-01-12 at 14:07 -0700, Jens Axboe wrote:
> You're really not making it easy for folks to run this :-)

My hope is that the ib_srp and ib_srpt patches will be accepted upstream soon.
As long as these are not upstream, anyone who wants to retrieve these patches
is welcome to clone 
https://github.com/bvanassche/linux/tree/block-scsi-for-next,
a kernel tree with all my pending patches.

> Do you have the matching blk-mq debugfs output for the device?

Sorry but I only retrieved the blk-mq debugfs several minutes after the hang
started so I'm not sure the state information is relevant. Anyway, I have 
attached
it to this e-mail. The most remarkable part is the following:

./9ddfa913/requeue_list:9646711c {.op=READ, .state=idle, gen=0x1
18, abort_gen=0x0, .cmd_flags=, .rq_flags=SORTED|1|SOFTBARRIER|IO_STAT, complete
=0, .tag=-1, .internal_tag=217}

The hexadecimal number at the start is the request_queue pointer (I modified the
blk-mq-debugfs code such that queues are registered with there address just 
after
creation and until a name is assigned). This is a dm-mpath queue.

Bart../9ddfa913/hctx0/cpu3/completed:29 0
./9ddfa913/hctx0/cpu3/merged:0
./9ddfa913/hctx0/cpu3/dispatched:30 0
./9ddfa913/hctx0/cpu2/completed:0 0
./9ddfa913/hctx0/cpu2/merged:0
./9ddfa913/hctx0/cpu2/dispatched:0 0
./9ddfa913/hctx0/cpu1/completed:0 0
./9ddfa913/hctx0/cpu1/merged:0
./9ddfa913/hctx0/cpu1/dispatched:0 0
./9ddfa913/hctx0/cpu0/completed:0 0
./9ddfa913/hctx0/cpu0/merged:0
./9ddfa913/hctx0/cpu0/dispatched:0 0
./9ddfa913/hctx0/active:0
./9ddfa913/hctx0/run:92
./9ddfa913/hctx0/queued:30
./9ddfa913/hctx0/dispatched:   00
./9ddfa913/hctx0/dispatched:   197
./9ddfa913/hctx0/dispatched:   20
./9ddfa913/hctx0/dispatched:   40
./9ddfa913/hctx0/dispatched:   80
./9ddfa913/hctx0/dispatched:  160
./9ddfa913/hctx0/dispatched:  32+   0
./9ddfa913/hctx0/io_poll:considered=0
./9ddfa913/hctx0/io_poll:invoked=0
./9ddfa913/hctx0/io_poll:success=0
./9ddfa913/hctx0/sched_tags_bitmap::      
  
./9ddfa913/hctx0/sched_tags_bitmap:0010:      
0002  
./9ddfa913/hctx0/sched_tags:nr_tags=256
./9ddfa913/hctx0/sched_tags:nr_reserved_tags=0
./9ddfa913/hctx0/sched_tags:active_queues=0
./9ddfa913/hctx0/sched_tags:
./9ddfa913/hctx0/sched_tags:bitmap_tags:
./9ddfa913/hctx0/sched_tags:depth=256
./9ddfa913/hctx0/sched_tags:busy=1
./9ddfa913/hctx0/sched_tags:bits_per_word=64
./9ddfa913/hctx0/sched_tags:map_nr=4
./9ddfa913/hctx0/sched_tags:alloc_hint={45, 56, 144, 218}
./9ddfa913/hctx0/sched_tags:wake_batch=8
./9ddfa913/hctx0/sched_tags:wake_index=0
./9ddfa913/hctx0/sched_tags:ws={
./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive},
./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive},
./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive},
./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive},
./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive},
./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive},
./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive},
./9ddfa913/hctx0/sched_tags:{.wait_cnt=8, .wait=inactive},
./9ddfa913/hctx0/sched_tags:}
./9ddfa913/hctx0/sched_tags:round_robin=0
./9ddfa913/hctx0/tags_bitmap::       
 
./9ddfa913/hctx0/tags_bitmap:0010:       
 
./9ddfa913/hctx0/tags_bitmap:0020:       
 
./9ddfa913/hctx0/tags_bitmap:0030:       
 
./9ddfa913/hctx0/tags_bitmap:0040:       
 
./9ddfa913/hctx0/tags_bitmap:0050:       
 
./9ddfa913/hctx0/tags_bitmap:0060:       
 
./9ddfa913/hctx0/tags_bitmap:0070:       
 
./9ddfa913/hctx0/tags_bitmap:0080:       
 
./9ddfa913/hctx0/tags_bitmap:0090:       
 
./9ddfa913/hctx0/tags_bitmap:00a0:       
 
./9ddfa913/hctx0/tags_bitmap:00b0:       
 
./9ddfa913/hctx0/tags_bitmap:00c0:       
 
./9ddfa913/hctx0/tags_bitmap:00d0:       
 
./9ddfa913/hctx0/tags_bitmap:00e0:      

Re: [PATCHSET v5] blk-mq: reimplement timeout handling

2018-01-12 Thread Bart Van Assche
On Tue, 2018-01-09 at 08:29 -0800, Tejun Heo wrote:
> Currently, blk-mq timeout path synchronizes against the usual
> issue/completion path using a complex scheme involving atomic
> bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
> rules.  Unfortunatley, it contains quite a few holes.

Hello Tejun,

With this patch series applied I see weird hangs in blk_mq_get_tag() when I
run the srp-test software. If I pull Jens' latest for-next branch and revert
this patch series then the srp-test software runs successfully. Note: if you
don't have InfiniBand hardware available then you will need the RDMA/CM
patches for the SRP initiator and target drivers that have been posted
recently on the linux-rdma mailing list to run the srp-test software.

This is how I run the srp-test software in a VM:

./run_tests -c -d -r 10

Here is an example of what SysRq-w reported when the hang occurred:

sysrq: SysRq : Show Blocked State
 taskPC stack   pid father
kworker/u8:0D12864 5  2 0x8000
Workqueue: events_unbound sd_probe_async [sd_mod]
Call Trace:
? __schedule+0x2b4/0xbb0
schedule+0x2d/0x90
io_schedule+0xd/0x30
blk_mq_get_tag+0x169/0x290
? finish_wait+0x80/0x80
blk_mq_get_request+0x16a/0x4f0
blk_mq_alloc_request+0x59/0xc0
blk_get_request_flags+0x3f/0x260
scsi_execute+0x33/0x1e0 [scsi_mod]
read_capacity_16.part.35+0x9c/0x460 [sd_mod]
sd_revalidate_disk+0x14bb/0x1cb0 [sd_mod]
sd_probe_async+0xf2/0x1a0 [sd_mod]
process_one_work+0x21c/0x6d0
worker_thread+0x35/0x380
? process_one_work+0x6d0/0x6d0
kthread+0x117/0x130
? kthread_create_worker_on_cpu+0x40/0x40
ret_from_fork+0x24/0x30
systemd-udevd   D13672  1048285 0x0100
Call Trace:
? __schedule+0x2b4/0xbb0
schedule+0x2d/0x90
io_schedule+0xd/0x30
generic_file_read_iter+0x32f/0x970
? page_cache_tree_insert+0x100/0x100
__vfs_read+0xcc/0x120
vfs_read+0x96/0x140
SyS_read+0x40/0xa0
do_syscall_64+0x5f/0x1b0
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f8ce6d08d11
RSP: 002b:7fff96dec288 EFLAGS: 0246 ORIG_RAX: 
RAX: ffda RBX: 5651de7f6e10 RCX: 7f8ce6d08d11
RDX: 0040 RSI: 5651de7f6e38 RDI: 0007
RBP: 5651de7ea500 R08: 7f8ce6cf1c20 R09: 5651de7f6e10
R10: 006f R11: 0246 R12: 01ff
R13: 01ff0040 R14: 5651de7ea550 R15: 0040
systemd-udevd   D13496  1049285 0x0100
Call Trace:
? __schedule+0x2b4/0xbb0
schedule+0x2d/0x90
io_schedule+0xd/0x30
blk_mq_get_tag+0x169/0x290
? finish_wait+0x80/0x80
blk_mq_get_request+0x16a/0x4f0
blk_mq_make_request+0x105/0x8e0
? generic_make_request+0xd6/0x3d0
generic_make_request+0x103/0x3d0
? submit_bio+0x57/0x110
submit_bio+0x57/0x110
mpage_readpages+0x13b/0x160
? I_BDEV+0x10/0x10
? rcu_read_lock_sched_held+0x66/0x70
? __alloc_pages_nodemask+0x2e8/0x360
__do_page_cache_readahead+0x2a4/0x370
? force_page_cache_readahead+0xaf/0x110
force_page_cache_readahead+0xaf/0x110
generic_file_read_iter+0x743/0x970
? find_held_lock+0x2d/0x90
? _raw_spin_unlock+0x29/0x40
__vfs_read+0xcc/0x120
vfs_read+0x96/0x140
SyS_read+0x40/0xa0
do_syscall_64+0x5f/0x1b0
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f8ce6d08d11
RSP: 002b:7fff96dec8b8 EFLAGS: 0246 ORIG_RAX: 
RAX: ffda RBX: 7f8ce7085010 RCX: 7f8ce6d08d11
RDX: 0004 RSI: 7f8ce7085038 RDI: 000f
RBP: 5651de7ec840 R08:  R09: 7f8ce7085010
R10: 7f8ce7085028 R11: 0246 R12: 
R13: 0004 R14: 5651de7ec890 R15: 0004
systemd-udevd   D13672  1055285 0x0100
Call Trace:
? __schedule+0x2b4/0xbb0
schedule+0x2d/0x90
io_schedule+0xd/0x30
blk_mq_get_tag+0x169/0x290
? finish_wait+0x80/0x80
blk_mq_get_request+0x16a/0x4f0
blk_mq_make_request+0x105/0x8e0
? generic_make_request+0xd6/0x3d0
generic_make_request+0x103/0x3d0
? submit_bio+0x57/0x110
submit_bio+0x57/0x110
mpage_readpages+0x13b/0x160
? I_BDEV+0x10/0x10
? rcu_read_lock_sched_held+0x66/0x70
? __alloc_pages_nodemask+0x2e8/0x360
__do_page_cache_readahead+0x2a4/0x370
? force_page_cache_readahead+0xaf/0x110
force_page_cache_readahead+0xaf/0x110
generic_file_read_iter+0x743/0x970
__vfs_read+0xcc/0x120
vfs_read+0x96/0x140
SyS_read+0x40/0xa0
do_syscall_64+0x5f/0x1b0
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f8ce6d08d11
RSP: 002b:7fff96dec848 EFLAGS: 0246 ORIG_RAX: 
RAX: ffda RBX: 5651de7ec300 RCX: 7f8ce6d08d11
RDX: 0100 RSI: 5651de7ec328 RDI: 000f
RBP: 5651de7ea500 R08:  R09: 5651de7ec300
R10: 5651de7ec318 R11: 0246 R12: 01ffe000
R13: 01ffe100 R14: 5651de7ea550 R15: 0100

Please let me know if you need more information.

Bart.N�r��yb�X��ǧv�^�)޺{.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥

Re: [PATCH 2/8] blk-mq: protect completion path with RCU

2018-01-09 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> Currently, blk-mq protects only the issue path with RCU.  This patch
> puts the completion path under the same RCU protection.  This will be
> used to synchronize issue/completion against timeout by later patches,
> which will also add the comments.
> 
> Signed-off-by: Tejun Heo 
> ---
>  block/blk-mq.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index ddc9261..6741c3e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -584,11 +584,16 @@ static void hctx_lock(struct blk_mq_hw_ctx *hctx, int 
> *srcu_idx)
>  void blk_mq_complete_request(struct request *rq)
>  {
>   struct request_queue *q = rq->q;
> + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
> + int srcu_idx;
>  
>   if (unlikely(blk_should_fake_timeout(q)))
>   return;
> +
> + hctx_lock(hctx, &srcu_idx);
>   if (!blk_mark_rq_complete(rq))
>   __blk_mq_complete_request(rq);
> + hctx_unlock(hctx, srcu_idx);
>  }
>  EXPORT_SYMBOL(blk_mq_complete_request);

Hello Tejun,

I'm concerned about the additional CPU cycles needed for the new 
blk_mq_map_queue()
call, although I know this call is cheap. Would the timeout code really get that
much more complicated if the hctx_lock() and hctx_unlock() calls would be 
changed
into rcu_read_lock() and rcu_read_unlock() calls? Would it be sufficient if
"if (has_rcu) synchronize_rcu();" would be changed into "synchronize_rcu();" in
blk_mq_timeout_work()?

Thanks,

Bart.N�r��yb�X��ǧv�^�)޺{.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥

Re: [PATCH 8/8] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu

2018-01-08 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> The RCU protection has been expanded to cover both queueing and
> completion paths making ->queue_rq_srcu a misnomer.  Rename it to
> ->srcu as suggested by Bart.

Reviewed-by: Bart Van Assche 



Re: [PATCH 7/8] blk-mq: remove REQ_ATOM_STARTED

2018-01-08 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> After the recent updates to use generation number and state based
> synchronization, we can easily replace REQ_ATOM_STARTED usages by
> adding an extra state to distinguish completed but not yet freed
> state.
> 
> Add MQ_RQ_COMPLETE and replace REQ_ATOM_STARTED usages with
> blk_mq_rq_state() tests.  REQ_ATOM_STARTED no longer has any users
> left and is removed.

Reviewed-by: Bart Van Assche 

N�r��yb�X��ǧv�^�)޺{.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥

Re: [PATCH 6/8] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> After the recent updates to use generation number and state based
> synchronization, blk-mq no longer depends on REQ_ATOM_COMPLETE except
> to avoid firing the same timeout multiple times.
> 
> Remove all REQ_ATOM_COMPLETE usages and use a new rq_flags flag
> RQF_MQ_TIMEOUT_EXPIRED to avoid firing the same timeout multiple
> times.  This removes atomic bitops from hot paths too.
> 
> v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out().
> 
> v3: Added RQF_MQ_TIMEOUT_EXPIRED flag.

I think it's unfortunate that this patch spreads the request state over two
struct request members, namely the lower two bits of gstate and the
RQF_MQ_TIMEOUT_EXPIRED flag in req->rq_flags. Please consider to drop the
RQF_MQ_TIMEOUT_EXPIRED flag and to represent the "timeout has been processed"
state as a MQ_RQ_* state in gstate. That will avoid that every state update
has to be reviewed whether or not perhaps an update of the
RQF_MQ_TIMEOUT_EXPIRED flag is missing.

Thanks,

Bart.N�r��yb�X��ǧv�^�)޺{.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥

Re: [PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme

2018-01-08 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> @@ -230,6 +232,27 @@ struct request {
>  
>   unsigned short write_hint;
>  
> + /*
> +  * On blk-mq, the lower bits of ->gstate carry the MQ_RQ_* state
> +  * value and the upper bits the generation number which is
> +  * monotonically incremented and used to distinguish the reuse
> +  * instances.
> +  *
> +  * ->gstate_seq allows updates to ->gstate and other fields
> +  * (currently ->deadline) during request start to be read
> +  * atomically from the timeout path, so that it can operate on a
> +  * coherent set of information.
> +  */
> + seqcount_t gstate_seq;
> + u64 gstate;
> +
> + /*
> +  * ->aborted_gstate is used by the timeout to claim a specific
> +  * recycle instance of this request.  See blk_mq_timeout_work().
> +  */
> + struct u64_stats_sync aborted_gstate_sync;
> + u64 aborted_gstate;
> +
>   unsigned long deadline;
>   struct list_head timeout_list;

Does "gstate" perhaps stand for "generation number and state"? If so, please
mention this in one of the above comments.

Thanks,

Bart.N�r��yb�X��ǧv�^�)޺{.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥

Re: [PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path

2018-01-08 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> @@ -156,12 +156,12 @@ void blk_timeout_work(struct work_struct *work)
>   */
>  void blk_abort_request(struct request *req)
>  {
> - if (blk_mark_rq_complete(req))
> - return;
> -
>   if (req->q->mq_ops) {
> - blk_mq_rq_timed_out(req, false);
> + req->deadline = jiffies;
> + mod_timer(&req->q->timeout, 0);
>   } else {
> + if (blk_mark_rq_complete(req))
> + return;
>   blk_delete_timer(req);
>   blk_rq_timed_out(req);
>   }

Other req->deadline writes are protected by preempt_disable(),
write_seqcount_begin(&rq->gstate_seq), write_seqcount_end(&rq->gstate_seq)
and preempt_enable(). I think it's fine that the above req->deadline store
does not have that protection but I also think that that deserves a comment.

Thanks,

Bart.

Re: [PATCH 4/8] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE

2018-01-08 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> blk_mq_check_inflight() and blk_mq_poll_hybrid_sleep() test
> REQ_ATOM_COMPLETE to determine the request state.  Both uses are
> speculative and we can test REQ_ATOM_STARTED and blk_mq_rq_state() for
> equivalent results.  Replace the tests.  This will allow removing
> REQ_ATOM_COMPLETE usages from blk-mq.

Reviewed-by: Bart Van Assche 



Re: [PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme

2018-01-08 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> +static void blk_mq_rq_update_aborted_gstate(struct request *rq, u64 gstate)
> +{
> + unsigned long flags;
> +
> + local_irq_save(flags);
> + u64_stats_update_begin(&rq->aborted_gstate_sync);
> + rq->aborted_gstate = gstate;
> + u64_stats_update_end(&rq->aborted_gstate_sync);
> + local_irq_restore(flags);
> +}

Please add a comment that explains the purpose of local_irq_save() and
local_irq_restore(). Please also explain why you chose to disable interrupts
instead of disabling preemption. I think that disabling preemption should be
sufficient since this is the only code that updates rq->aborted_gstate and
since this function is always called from thread context.

> @@ -801,6 +840,12 @@ void blk_mq_rq_timed_out(struct request *req, bool 
> reserved)
>   __blk_mq_complete_request(req);
>   break;
>   case BLK_EH_RESET_TIMER:
> + /*
> +  * As nothing prevents from completion happening while
> +  * ->aborted_gstate is set, this may lead to ignored
> +  * completions and further spurious timeouts.
> +  */
> + blk_mq_rq_update_aborted_gstate(req, 0);
>   blk_add_timer(req);
>   blk_clear_rq_complete(req);
>   break;

Is the race that the comment refers to addressed by one of the later patches?

Thanks,

Bart.N�r��yb�X��ǧv�^�)޺{.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥

Re: [PATCH 1/8] blk-mq: move hctx lock/unlock into a helper

2018-01-08 Thread Bart Van Assche
On Mon, 2018-01-08 at 11:15 -0800, Tejun Heo wrote:
> +static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx)
> +{
> + if (!(hctx->flags & BLK_MQ_F_BLOCKING))
> + rcu_read_unlock();
> + else
> + srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx);
> +}
> +
> +static void hctx_lock(struct blk_mq_hw_ctx *hctx, int *srcu_idx)
> +{
> + if (!(hctx->flags & BLK_MQ_F_BLOCKING))
> + rcu_read_lock();
> + else
> + *srcu_idx = srcu_read_lock(hctx->queue_rq_srcu);
> +}

A minor comment: please consider to reorder these two functions such that the
lock function appears first and the unlock function second. Anyway:

Reviewed-by: Bart Van Assche 



fs_reclaim() deadlock complaint

2017-10-12 Thread Bart Van Assche
Hello,

Since I started testing kernel v4.14-rc1 I see a deadlock complaint appearing
in the kernel log during boot. Is this a known issue?

$ cat /etc/fstab
#
/dev/mapper/vg-root /   btrfs   defaults,subvol=@ 0   1
/dev/mapper/vg-boot /boot   ext4defaults0   2
/dev/mapper/vg-root /home   btrfs   defaults,subvol=@home 0   2
UUID=5217d83f-213e-4b42-b86e-20013325ba6c none  swapsw  0   0

From the system log:

systemd[1]: Started Session c5 of user root.
kernel: 
kernel: 
kernel: WARNING: possible recursive locking detected
kernel: 4.14.0-rc3-dbg+ #11 Not tainted
kernel: 
kernel: dpkg/10251 is trying to acquire lock:
kernel: (fs_reclaim){+.+.}, at: [] 
fs_reclaim_acquire.part.85+0x0/0x30
kernel: 
kernel: but task is already holding lock:
kernel: (fs_reclaim){+.+.}, at: [] 
fs_reclaim_acquire.part.85+0x0/0x30
kernel: 
kernel: other info that might help us debug this:
kernel: Possible unsafe locking scenario:
kernel: 
kernel:   CPU0
kernel:   
kernel:  lock(fs_reclaim);
kernel:  lock(fs_reclaim);
kernel: 
kernel: *** DEADLOCK ***
kernel: 
kernel: May be due to missing lock nesting notation
kernel: 
kernel: 2 locks held by dpkg/10251:
kernel: #0:  (&mm->mmap_sem){}, at: [] 
__do_page_fault+0x11f/0x460
kernel: #1:  (fs_reclaim){+.+.}, at: [] 
fs_reclaim_acquire.part.85+0x0/0x30
kernel: 
kernel: stack backtrace:
kernel: CPU: 1 PID: 10251 Comm: dpkg Not tainted 4.14.0-rc3-dbg+ #11
kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.0.0-prebuilt.qemu-project.org 04/01/2014
kernel: Call Trace:
kernel: dump_stack+0x86/0xcf
kernel: __lock_acquire+0x720/0x1440
kernel: ? unwind_get_return_address+0x1a/0x30
kernel: ? __bfs+0x12e/0x210
kernel: lock_acquire+0xf5/0x200
kernel: ? lock_acquire+0xf5/0x200
kernel: ? pageset_set_high_and_batch+0x90/0x90
kernel: ? alloc_extent_state+0x1f/0x1a0 [btrfs]
kernel: fs_reclaim_acquire.part.85+0x24/0x30
kernel: ? pageset_set_high_and_batch+0x90/0x90
kernel: fs_reclaim_acquire+0x14/0x20
kernel: kmem_cache_alloc+0x2a/0x2c0
kernel: ? lock_acquire+0xf5/0x200
kernel: alloc_extent_state+0x1f/0x1a0 [btrfs]
kernel: __clear_extent_bit+0x26f/0x3f0 [btrfs]
kernel: ? test_range_bit+0xd1/0x100 [btrfs]
kernel: try_release_extent_mapping+0x192/0x1e0 [btrfs]
kernel: __btrfs_releasepage+0x2f/0x70 [btrfs]
kernel: ? page_get_anon_vma+0x170/0x170
kernel: btrfs_releasepage+0x3c/0x40 [btrfs]
kernel: try_to_release_page+0x4e/0x60
kernel: shrink_page_list+0xbc8/0x1010
kernel: ? trace_hardirqs_on_caller+0xf4/0x190
kernel: shrink_inactive_list+0x1a9/0x520
kernel: shrink_node_memcg.constprop.80+0x4ae/0x750
kernel: ? rcu_read_lock_sched_held+0x45/0x80
kernel: shrink_node+0x45/0x1a0
kernel: ? shrink_node+0x45/0x1a0
kernel: do_try_to_free_pages+0xd1/0x2c0
kernel: try_to_free_pages+0xed/0x330
kernel: ? pageset_set_high_and_batch+0x90/0x90
kernel: __alloc_pages_slowpath+0x454/0x1220
kernel: __alloc_pages_nodemask+0x273/0x300
kernel: mm_get_huge_zero_page+0x7f/0xf0
kernel: do_huge_pmd_anonymous_page+0x301/0x500
kernel: __handle_mm_fault+0xb42/0xd70
kernel: handle_mm_fault+0x8d/0x100
kernel: __do_page_fault+0x23f/0x460
kernel: ? trace_hardirqs_on_caller+0xf4/0x190
kernel: do_page_fault+0x2e/0x280
kernel: do_async_page_fault+0x14/0x60
kernel: async_page_fault+0x22/0x30
kernel: RIP: 0033:0x55704b9c5263
kernel: RSP: 002b:7ffc8326c4a0 EFLAGS: 00010202
kernel: RAX: 55704c390f10 RBX:  RCX: 7f8ec8985b00
kernel: RDX: 55704c390f10 RSI: 55704c390f40 RDI: 
kernel: RBP: 55704b9d2350 R08: 7f8ec8987760 R09: 0040
kernel: R10: 7f8ec8985b58 R11: 0202 R12: 55704b9ab410
kernel: R13: 7ffc8326c5f0 R14:  R15: 

Thanks,

Bart.

Re: [PATCH 06/15] fs: simplify dio_bio_complete

2017-05-24 Thread Bart Van Assche
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> Only read bio->bi_error once in the common path.

Reviewed-by: Bart Van Assche --
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/15] fs: remove the unused error argument to dio_end_io()

2017-05-24 Thread Bart Van Assche
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Bart Van Assche --
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/15] dm: fix REQ_RAHEAD handling

2017-05-24 Thread Bart Van Assche
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> A few (but not all) dm targets use a special EWOULDBLOCK error code for
> failing REQ_RAHEAD requests that fail due to a lack of available resources.
> But no one else knows about this magic code, and lower level drivers also
> don't generate it when failing read-ahead requests for similar reasons.
> 
> So remove this special casing and ignore all additional error handling for
> REQ_RAHEAD - if this was a real underlying error we'd get a normal read
> once the real read comes in.

Reviewed-by: Bart Van Assche --
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/15] gfs2: remove the unused sd_log_error field

2017-05-24 Thread Bart Van Assche
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Bart Van Assche --
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] scsi/osd: don't save block errors into req_results

2017-05-24 Thread Bart Van Assche
On Thu, 2017-05-18 at 15:17 +0200, Christoph Hellwig wrote:
> We will only have sense data if the command exectured and got a SCSI
> result, so this is pointless.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/osd/osd_initiator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/osd/osd_initiator.c 
> b/drivers/scsi/osd/osd_initiator.c
> index 8a1b94816419..14785177ce7b 100644
> --- a/drivers/scsi/osd/osd_initiator.c
> +++ b/drivers/scsi/osd/osd_initiator.c
> @@ -477,7 +477,7 @@ static void _set_error_resid(struct osd_request *or, 
> struct request *req,
>int error)
>  {
>   or->async_error = error;
> - or->req_errors = scsi_req(req)->result ? : error;
> + or->req_errors = scsi_req(req)->result;
>   or->sense_len = scsi_req(req)->sense_len;
>   if (or->sense_len)
>   memcpy(or->sense, scsi_req(req)->sense, or->sense_len);

Hello Christoph,

Are you sure that that code is not necessary? From osd_initiator.c:

static void _put_request(struct request *rq)
{
/*
 * If osd_finalize_request() was called but the request was not
 * executed through the block layer, then we must release BIOs.
 * TODO: Keep error code in or->async_error. Need to audit all
 *   code paths.
 */
if (unlikely(rq->bio))
blk_end_request(rq, -ENOMEM, blk_rq_bytes(rq));
else
blk_put_request(rq);
}

Bart.--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/15] nvme-lightnvm: use blk_execute_rq in nvme_nvm_submit_user_cmd

2017-05-24 Thread Bart Van Assche
On Thu, 2017-05-18 at 15:17 +0200, Christoph Hellwig wrote:
> Instead of reinventing it poorly.

Reviewed-by: Bart Van Assche --
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 12/15] block: merge blk_types.h into bio.h

2017-05-18 Thread Bart Van Assche
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> We've cleaned up our headers sufficiently that we don't need this split
> anymore.

Hello Christoph,

Request-based drivers need the structure definitions from  and
the type definitions from  but do not need the definition of
struct bio. Have you considered to remove #include  from file
include/linux/blkdev.h? Do you think that would help to reduce the kernel build
time?

Thanks,

Bart.N�r��yb�X��ǧv�^�)޺{.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥

Re: [RFC PATCH 0/2] Introduce blkdev_issue_flush_no_wait()

2017-05-16 Thread Bart Van Assche
On Tue, 2017-05-16 at 17:39 +0800, Anand Jain wrote:
> BTRFS wanted a block device flush function which does not wait for
> its completion, so that the flush for the next device can be called
> in the same thread.
> 
> Here is a RFC patch to provide the function
> 'blkdev_issue_flush_no_wait()', which is based on the current device
> flush function 'blkdev_issue_flush()', however it uses submit_bio()
> instead of submit_bio_wait().
> 
> This patch is for review comments, will send out a final patch based
> on the comments received.

Hello Anand,

Since the block layer can reorder requests, I think using
blkdev_issue_flush_no_wait() will only yield the intended result if
the caller waits until the requests that have to be flushed have completed.
Is that how you intend to use this function?

Thanks,

Bart.N�r��yb�X��ǧv�^�)޺{.n�+{�n�߲)w*jg����ݢj/���z�ޖ��2�ޙ&�)ߡ�a�����G���h��j:+v���w��٥

[PATCH 2/3] block, dm-crypt, btrfs: Introduce bio_flags()

2016-09-14 Thread Bart Van Assche
Introduce the bio_flags() macro. Ensure that the second argument of
bio_set_op_attrs() only contains flags and no operation. This patch
does not change any functionality.

Signed-off-by: Bart Van Assche 
Cc: Mike Christie 
Cc: Chris Mason  (maintainer:BTRFS FILE SYSTEM)
Cc: Josef Bacik  (maintainer:BTRFS FILE SYSTEM)
Cc: Mike Snitzer 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Damien Le Moal 
---
 drivers/md/dm-crypt.c | 2 +-
 fs/btrfs/inode.c  | 5 +++--
 include/linux/blk_types.h | 3 ++-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 8742957..0448e7e 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1136,7 +1136,7 @@ static void clone_init(struct dm_crypt_io *io, struct bio 
*clone)
clone->bi_private = io;
clone->bi_end_io  = crypt_endio;
clone->bi_bdev= cc->dev->bdev;
-   bio_set_op_attrs(clone, bio_op(io->base_bio), io->base_bio->bi_opf);
+   bio_set_op_attrs(clone, bio_op(io->base_bio), bio_flags(io->base_bio));
 }
 
 static int kcryptd_io_read(struct dm_crypt_io *io, gfp_t gfp)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e6811c4..ca01106 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8412,7 +8412,7 @@ static int btrfs_submit_direct_hook(struct 
btrfs_dio_private *dip,
if (!bio)
return -ENOMEM;
 
-   bio_set_op_attrs(bio, bio_op(orig_bio), orig_bio->bi_opf);
+   bio_set_op_attrs(bio, bio_op(orig_bio), bio_flags(orig_bio));
bio->bi_private = dip;
bio->bi_end_io = btrfs_end_dio_bio;
btrfs_io_bio(bio)->logical = file_offset;
@@ -8450,7 +8450,8 @@ next_block:
  start_sector, GFP_NOFS);
if (!bio)
goto out_err;
-   bio_set_op_attrs(bio, bio_op(orig_bio), 
orig_bio->bi_opf);
+   bio_set_op_attrs(bio, bio_op(orig_bio),
+bio_flags(orig_bio));
bio->bi_private = dip;
bio->bi_end_io = btrfs_end_dio_bio;
btrfs_io_bio(bio)->logical = file_offset;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 1e1ef21..311fa2f 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -90,11 +90,12 @@ struct bio {
 };
 
 #define BIO_OP_SHIFT   (8 * FIELD_SIZEOF(struct bio, bi_opf) - REQ_OP_BITS)
+#define bio_flags(bio) ((bio)->bi_opf & ((1 << BIO_OP_SHIFT) - 1))
 #define bio_op(bio)((bio)->bi_opf >> BIO_OP_SHIFT)
 
 #define bio_set_op_attrs(bio, op, op_flags) do {   \
WARN_ON(op >= (1 << REQ_OP_BITS));  \
-   (bio)->bi_opf &= ((1 << BIO_OP_SHIFT) - 1); \
+   (bio)->bi_opf = bio_flags(bio); \
(bio)->bi_opf |= ((unsigned int) (op) << BIO_OP_SHIFT); \
(bio)->bi_opf |= op_flags;  \
 } while (0)
-- 
2.10.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] block: Document that bio_op() uses the data type of bio.bi_opf

2016-09-14 Thread Bart Van Assche
Make it clear that the sizeof(unsigned int) expression in BIO_OP_SHIFT
refers to the bi_opf member of struct bio.

Signed-off-by: Bart Van Assche 
Cc: Mike Christie 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Damien Le Moal 
---
 include/linux/blk_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 436f43f..1e1ef21 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -89,7 +89,7 @@ struct bio {
struct bio_vec  bi_inline_vecs[0];
 };
 
-#define BIO_OP_SHIFT   (8 * sizeof(unsigned int) - REQ_OP_BITS)
+#define BIO_OP_SHIFT   (8 * FIELD_SIZEOF(struct bio, bi_opf) - REQ_OP_BITS)
 #define bio_op(bio)((bio)->bi_opf >> BIO_OP_SHIFT)
 
 #define bio_set_op_attrs(bio, op, op_flags) do {   \
-- 
2.10.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] block: Improve bio_set_op_attrs() robustness

2016-09-14 Thread Bart Van Assche
Since REQ_OP_BITS == 3 and __REQ_NR_BITS == 30 it is not that hard
to pass an op_flags argument to bio_set_op_attrs() that is larger
than the number of bits reserved for the op_flags argument. Complain
if this happens. Additionally, ensure that negative arguments trigger
a complaint (1 << ... is signed while 1U << ... is unsigned; adding
0U to an integer expression causes it to be promoted to an unsigned
type).

Signed-off-by: Bart Van Assche 
Cc: Mike Christie 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Damien Le Moal 
---
 include/linux/blk_types.h | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 311fa2f..53ee1a2 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -93,11 +93,18 @@ struct bio {
 #define bio_flags(bio) ((bio)->bi_opf & ((1 << BIO_OP_SHIFT) - 1))
 #define bio_op(bio)((bio)->bi_opf >> BIO_OP_SHIFT)
 
-#define bio_set_op_attrs(bio, op, op_flags) do {   \
-   WARN_ON(op >= (1 << REQ_OP_BITS));  \
-   (bio)->bi_opf = bio_flags(bio); \
-   (bio)->bi_opf |= ((unsigned int) (op) << BIO_OP_SHIFT); \
-   (bio)->bi_opf |= op_flags;  \
+#define bio_set_op_attrs(bio, op, op_flags) do {   \
+   if (__builtin_constant_p(op))   \
+   BUILD_BUG_ON((op) + 0U >= (1U << REQ_OP_BITS)); \
+   else\
+   WARN_ON_ONCE((op) + 0U >= (1U << REQ_OP_BITS)); \
+   if (__builtin_constant_p(op_flags)) \
+   BUILD_BUG_ON((op_flags) + 0U >= (1U << BIO_OP_SHIFT));  \
+   else\
+   WARN_ON_ONCE((op_flags) + 0U >= (1U << BIO_OP_SHIFT));  \
+   (bio)->bi_opf = bio_flags(bio); \
+   (bio)->bi_opf |= (((op) + 0U) << BIO_OP_SHIFT); \
+   (bio)->bi_opf |= (op_flags);\
 } while (0)
 
 #define BIO_RESET_BYTESoffsetof(struct bio, bi_max_vecs)
-- 
2.10.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] block: Improve bio_set_op_attrs() robustness

2016-09-14 Thread Bart Van Assche

Hi Jens,

bio_set_op_attrs() does not yet check whether the "op_flags" field 
overflows into the "op" field. Since adding support for SMR requires to 
introduce more REQ_* flags I think it is important to have such a check. 
Hence this patch series. Please consider these patches for inclusion in 
the upstream Linux kernel.


Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [dm-devel] [PATCH 01/35] block/fs/drivers: remove rw argument from submit_bio

2016-01-06 Thread Bart Van Assche

On 01/05/2016 09:53 PM, mchri...@redhat.com wrote:

From: Mike Christie 

This has callers of submit_bio/submit_bio_wait set the bio->bi_rw
instead of passing it in. This makes that use the same as
generic_make_request and how we set the other bio fields.


Reviewed-by: Bart Van Assche 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html