Re: [PATCH v2 2/2] block: Fix a race between the throttling code and request queue initialization

2018-02-02 Thread Bart Van Assche
On Fri, 2018-02-02 at 09:02 +0800, Joseph Qi wrote: > We triggered this race when using single queue. I'm not sure if it > exists in multi-queue. Regarding the races between modifying the queue_lock pointer and the code that uses that pointer, I think the following construct in

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-02 Thread Bart Van Assche
On Fri, 2018-02-02 at 15:08 +0100, Roman Pen wrote: > Since the first version the following was changed: > >- Load-balancing and IO fail-over using multipath features were added. >- Major parts of the code were rewritten, simplified and overall code > size was reduced by a quarter.

Re: [PATCH 23/24] ibnbd: a bit of documentation

2018-02-02 Thread Bart Van Assche
On Fri, 2018-02-02 at 15:09 +0100, Roman Pen wrote: > +Entries under /sys/kernel/ibnbd_client/ > +=== > [ ... ] You will need Greg KH's permission to add new entries directly under /sys/kernel. Since I think that it is unlikely that he will give that

[PATCH] blk-mq: Fix a race between resetting the timer and completion handling

2018-02-01 Thread Bart Van Assche
worker_thread+0x35/0x380 kthread+0x117/0x130 ret_from_fork+0x24/0x30 Fixes: 1d9bd5161ba3 ("blk-mq: replace timeout synchronization with a RCU and generation based scheme") Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Tejun Heo <t...@kernel.org> --- block/blk-core.c

Re: [LSF/MM TOPIC] KPTI effect on IO performance

2018-02-01 Thread Bart Van Assche
On 01/31/18 00:23, Ming Lei wrote: After KPTI is merged, there is extra load introduced to context switch between user space and kernel space. It is observed on my laptop that one syscall takes extra ~0.15us[1] compared with 'nopti'. IO performance is affected too, it is observed that IOPS

Re: [PATCH v2 2/2] block: Fix a race between the throttling code and request queue initialization

2018-02-01 Thread Bart Van Assche
On Thu, 2018-02-01 at 09:53 +0800, Joseph Qi wrote: > I'm afraid the risk may also exist in blk_cleanup_queue, which will > set queue_lock to to the default internal lock. > > spin_lock_irq(lock); > if (q->queue_lock != >__queue_lock) > q->queue_lock = >__queue_lock; >

[PATCH v2 0/2] block: Fix a race between the throttling code and request queue initialization

2018-01-31 Thread Bart Van Assche
. - Dropped blk_alloc_queue_node2() and modified all block drivers that call blk_alloc_queue_node(). Bart Van Assche (2): block: Add a third argument to blk_alloc_queue_node() block: Fix a race between the throttling code and request queue initialization block/blk-core.c | 29

[PATCH v2 2/2] block: Fix a race between the throttling code and request queue initialization

2018-01-31 Thread Bart Van Assche
() call in the driver into a blk_alloc_queue_node() call and remove the explicit .queue_lock initialization. Additionally, initialize the spin lock that will be used as queue lock earlier if necessary. Reported-by: Joseph Qi <joseph...@linux.alibaba.com> Signed-off-by: Bart Van Assche <bart.vanass...@

[PATCH v2 1/2] block: Add a third argument to blk_alloc_queue_node()

2018-01-31 Thread Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Joseph Qi <joseph...@linux.alibaba.com> Cc: Philipp Reisner <philipp.reis...@linbit.com> Cc: Ulf Hansson <ulf.hans...@linar

[PATCH] block: Fix a race between the throttling code and request queue initialization

2018-01-31 Thread Bart Van Assche
ization. Additionally, initialize the spin lock that will be used as queue lock earlier if necessary. Reported-by: Joseph Qi <joseph...@linux.alibaba.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Joseph Qi <joseph...@linux.alib

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Bart Van Assche
On Tue, 2018-01-30 at 20:22 -0700, Jens Axboe wrote: > On 1/30/18 8:21 PM, Bart Van Assche wrote: > > On Tue, 2018-01-30 at 20:17 -0700, Jens Axboe wrote: > > > BLK_STS_RESOURCE should always be safe to return, and it should work > > > the same as STS_DEV_RESOURCE, e

Re: [RFC] hard LOCKUP caused by race between blk_init_queue_node and blkcg_print_blkgs

2018-01-30 Thread Bart Van Assche
On Tue, 2018-01-30 at 19:21 +0800, Joseph Qi wrote: > Hi Jens and Folks, > > Recently we've gotten a hard LOCKUP issue. After investigating the issue > we've found a race between blk_init_queue_node and blkcg_print_blkgs. > The race is described below. > > blk_init_queue_node

Re: [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Bart Van Assche
On Tue, 2018-01-30 at 14:33 -0500, Mike Snitzer wrote: > On Tue, Jan 30 2018 at 12:52pm -0500, > Bart Van Assche <bart.vanass...@wdc.com> wrote: > > > - This patch does not fix any bugs nor makes block drivers easier to > > read or to implement. So why is t

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Bart Van Assche
On 01/30/18 06:24, Mike Snitzer wrote: +* +* If driver returns BLK_STS_RESOURCE and SCHED_RESTART +* bit is set, run queue after a delay to avoid IO stalls +* that could otherwise occur if the queue is idle. */ -

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-29 Thread Bart Van Assche
On Tue, 2018-01-30 at 09:07 +0800, Ming Lei wrote: > On Mon, Jan 29, 2018 at 04:48:31PM +0000, Bart Van Assche wrote: > > - It is easy to fix this race inside the block layer, namely by using > > call_rcu() inside the blk_mq_delay_run_hw_queue() implementation to > >

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-29 Thread Bart Van Assche
On 01/19/18 07:24, Jens Axboe wrote: That's what I thought. So for a low queue depth underlying queue, it's quite possible that this situation can happen. Two potential solutions I see: 1) As described earlier in this thread, having a mechanism for being notified when the scarce resource

Re: [PATCH V4] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-29 Thread Bart Van Assche
On Mon, 2018-01-29 at 17:14 -0500, Mike Snitzer wrote: > On Mon, Jan 29 2018 at 4:51pm -0500, > Bart Van Assche <bart.vanass...@wdc.com> wrote: > > > On Mon, 2018-01-29 at 16:44 -0500, Mike Snitzer wrote: > > > But regardless of which wins the race, the queue w

Re: [PATCH V4] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-29 Thread Bart Van Assche
On Mon, 2018-01-29 at 16:44 -0500, Mike Snitzer wrote: > But regardless of which wins the race, the queue will have been run. > Which is all we care about right? Running the queue is not sufficient. With this patch applied it can happen that the block driver returns BLK_STS_DEV_RESOURCE, that the

Re: [PATCH V4] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-29 Thread Bart Van Assche
On Mon, 2018-01-29 at 15:33 -0500, Mike Snitzer wrote: > + * If driver returns BLK_STS_RESOURCE and SCHED_RESTART > + * bit is set, run queue after 10ms to avoid IO stalls > + * that could otherwise occur if the queue is idle. >*/ > -

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-29 Thread Bart Van Assche
On Sun, 2018-01-28 at 07:41 +0800, Ming Lei wrote: > Not mention, the request isn't added to dispatch list yet in .queue_rq(), > strictly speaking, it is not correct to call blk_mq_delay_run_hw_queue() in > .queue_rq(), so the current block layer API can't handle it well enough. I disagree that

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-28 Thread Bart Van Assche
On Sat, 2018-01-27 at 19:23 -0500, Mike Snitzer wrote: > On Sat, Jan 27 2018 at 5:12pm -0500, > Bart Van Assche <bart.vanass...@wdc.com> wrote: > > - The patch at the start of this thread introduces a regression in the > > SCSI core, namely a queue stall if a r

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-28 Thread Bart Van Assche
On Sat, 2018-01-27 at 23:58 -0500, Mike Snitzer wrote: > On Sat, Jan 27 2018 at 10:00pm -0500, > Bart Van Assche <bart.vanass...@wdc.com> wrote: > > > On Sat, 2018-01-27 at 21:03 -0500, Mike Snitzer wrote: > > > You cannot even be forthcoming about the t

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-27 Thread Bart Van Assche
On Sat, 2018-01-27 at 21:03 -0500, Mike Snitzer wrote: > You cannot even be forthcoming about the technical merit of a change you > authored (commit 6077c2d70) that I'm left to clean up in the face of > performance bottlenecks it unwittingly introduced? If you were being > honest: you'd grant

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-27 Thread Bart Van Assche
On Sat, 2018-01-27 at 19:23 -0500, Mike Snitzer wrote: > Your contributions do _not_ make up for your inability to work well with > others. Tiresome doesn't begin to describe these interactions. > > Life is too short to continue enduring your bullshit. > > But do let us know when you have

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-27 Thread Bart Van Assche
On Sat, 2018-01-27 at 14:09 -0500, Mike Snitzer wrote: > Ming let me know that he successfully tested this V3 patch using both > your test (fio to both mpath and underlying path) and Bart's (02-mq with > can_queue in guest). > > Would be great if you'd review and verify this fix works for you

Re: [PATCH v2 3/4] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device

2018-01-25 Thread Bart Van Assche
On Sat, 2018-01-13 at 23:54 +0800, Ming Lei wrote: > Thanks for your mentioning, then I can find the following comment in > srp_queuecommand(): > > /* >* The SCSI EH thread is the only context from which > srp_queuecommand() >* can get invoked for

Re: [PATCH v2] bsg: use pr_debug instead of hand craftet macros

2018-01-24 Thread Bart Van Assche
On Tue, 2018-01-23 at 11:55 +0100, Johannes Thumshirn wrote: > Use pr_debug instead of hand craftet macros. This way it is not needed to > re-compile the kernel to enable bsg debug outputs and it's possible to > selectively enable specific prints. Reviewed-by: Bart Van Assche <

Re: [PATCH v2] bsg: use pr_debug instead of hand craftet macros

2018-01-23 Thread Bart Van Assche
On Tue, 2018-01-23 at 04:45 -0800, Joe Perches wrote: > Perhaps the email subject could be improved to describe > the new macro and as well, this macro, without a pr_fmt > define somewhat above it loses the __func__ output too. Hmm ... I thought that the pr_debug() output can be configured to

Re: [PATCH 1/5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-23 Thread Bart Van Assche
On Wed, 2018-01-24 at 00:59 +0800, Ming Lei wrote: > How is that enough to fix the IO hang when driver returns STS_RESOURCE > and the queue is idle? If you want to follow previous dm-rq's way of > call blk_mq_delay_run_hw_queue() in .queue_rq(), the same trick need > to be applied to other drivers

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-23 Thread Bart Van Assche
On Wed, 2018-01-24 at 00:37 +0800, Ming Lei wrote: > On Tue, Jan 23, 2018 at 04:24:20PM +0000, Bart Van Assche wrote: > > My opinion about this patch is as follows: > > * Changing a blk_mq_delay_run_hw_queue() call followed by return > > BLK_STS_DEV_RESOURCE into return BLK_

Re: [PATCH 1/5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-23 Thread Bart Van Assche
On Wed, 2018-01-24 at 00:49 +0800, Ming Lei wrote: > On Tue, Jan 23, 2018 at 04:47:11PM +0000, Bart Van Assche wrote: > > On Wed, 2018-01-24 at 00:41 +0800, Ming Lei wrote: > > > Could you explain where to call call_rcu()? call_rcu() can't be used in > > > IO path at al

Re: [PATCH 1/5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-23 Thread Bart Van Assche
On Wed, 2018-01-24 at 00:41 +0800, Ming Lei wrote: > Could you explain where to call call_rcu()? call_rcu() can't be used in > IO path at all. Can you explain what makes you think that call_rcu() can't be used in the I/O path? As you know call_rcu() invokes a function asynchronously. From

Re: [PATCH] block: neutralize blk_insert_cloned_request IO stall regression (was: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle)

2018-01-23 Thread Bart Van Assche
On Tue, 2018-01-23 at 10:22 +0100, Mike Snitzer wrote: > On Thu, Jan 18 2018 at 5:20pm -0500, > Bart Van Assche <bart.vanass...@wdc.com> wrote: > > > On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote: > > > And yet Laurence cannot reproduce any such lockups

Re: [PATCH 1/5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-23 Thread Bart Van Assche
On 01/23/18 08:26, Ming Lei wrote: On Tue, Jan 23, 2018 at 08:17:02AM -0800, Bart Van Assche wrote: On 01/22/18 16:57, Ming Lei wrote: Even though RCU lock is held during dispatch, preemption or interrupt can happen too, so it is simply wrong to depend on the timing to make sure

Re: [PATCH V3] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-23 Thread Bart Van Assche
On Wed, 2018-01-24 at 00:16 +0800, Ming Lei wrote: > @@ -1280,10 +1282,18 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, > struct list_head *list, >* - Some but not all block drivers stop a queue before >* returning BLK_STS_RESOURCE. Two exceptions are

Re: [PATCH v5 1/2] Return bytes transferred for partial direct I/O

2018-01-22 Thread Bart Van Assche
On Mon, 2018-01-22 at 16:14 -0700, Jens Axboe wrote: > On 1/22/18 3:25 PM, Elliott, Robert (Persistent Memory) wrote: > > fio engines/sg.c fio_sgio_rw_doio() has that pattern: > > > > ret = write(f->fd, hdr, sizeof(*hdr)); > > if (ret < 0) > > return ret; > > ... > >

Re: [PATCH TRIVIAL] bsg: use pr_debug instead of hand crafted macros

2018-01-22 Thread Bart Van Assche
On Mon, 2018-01-22 at 12:23 -0800, Joe Perches wrote: > On Mon, 2018-01-22 at 08:53 +0100, Johannes Thumshirn wrote: > > Use pr_debug instead of hand crafted macros. This way it is not needed to > > re-compile the kernel to enable bsg debug outputs and it's possible to > > selectively enable

Re: blk-mq hangs easily with LLVM+clang test suite

2018-01-22 Thread Bart Van Assche
On Sun, 2018-01-21 at 12:39 -0500, David Zarzycki wrote: > Hi Bart, > > I can do [1] and [2] but I won’t be able to provide the command output > because the hang is almost total. No new commands can run because the > scheduler is hung. For example, see this backtrace: > >

Re: [PATCH 4/5] blk-mq: introduce blk_get_request_notify

2018-01-22 Thread Bart Van Assche
On Mon, 2018-01-22 at 11:35 +0800, Ming Lei wrote: > DM-MPATH need to allocate request from underlying queue, but when the > allocation fails, there is no way to make underlying queue's RESTART > to restart DM's queue. > > This patch introduces blk_get_request_notify() for this purpose, and >

Re: [PATCH 1/5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-22 Thread Bart Van Assche
On Mon, 2018-01-22 at 11:35 +0800, Ming Lei wrote: > @@ -1280,10 +1282,18 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, > struct list_head *list, >* - Some but not all block drivers stop a queue before >* returning BLK_STS_RESOURCE. Two exceptions are

Re: [PATCH TRIVIAL] bsg: use pr_debug instead of hand crafted macros

2018-01-22 Thread Bart Van Assche
On 01/21/18 23:53, Johannes Thumshirn wrote: - dprintk("map hdr %llx/%u %llx/%u\n", (unsigned long long) hdr->dout_xferp, + pr_debug("map hdr %llx/%u %llx/%u\n", +(unsigned long long) hdr->dout_xferp, hdr->dout_xfer_len, (unsigned long long)

Re: blk-mq hangs easily with LLVM+clang test suite

2018-01-21 Thread Bart Van Assche
On Sun, 2018-01-21 at 11:19 -0500, David Zarzycki wrote: > How can I help debug this further? Please apply patch [1] on top of Jens' for-next branch [2], reproduce the hang and provide the output of the following command: (cd /sys/kernel/debug/block && find . -type f -exec grep -aH . {} \;)

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-19 Thread Bart Van Assche
On Fri, 2018-01-19 at 15:34 +0800, Ming Lei wrote: > Could you explain a bit when SCSI target replies with BUSY very often? > > Inside initiator, we have limited the max per-LUN requests and per-host > requests already before calling .queue_rq(). That's correct. However, when a SCSI initiator

[PATCH] lib/scatterlist: Fix chaining support in sgl_alloc_order()

2018-01-19 Thread Bart Van Assche
ret_from_fork+0x1f/0x30 Fixes: e80a0af4759a ("lib/scatterlist: Introduce sgl_alloc() and sgl_free()") Reported-by: Laurence Oberman <lober...@redhat.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Nicholas A. Bellinger <n...@linux-iscsi.org> Cc: Laurence

[PATCH 3/3] block: Remove kblockd_schedule_delayed_work{,_on}()

2018-01-19 Thread Bart Van Assche
The previous patch removed all users of these two functions. Hence also remove the functions themselves. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/blk-core.c | 14 -- include/linux/blkdev.h | 2 -- 2 files changed, 16 deletions(-) diff --git a

[PATCH 0/3] Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays

2018-01-19 Thread Bart Van Assche
Hello Jens, Earlier today it was agreed that a blk_mq_delay_run_hw_queue() call followed by a blk_mq_run_hw_queue() call should result in an immediate queue run. Hence this patch series. Please consider this patch series for kernel v4.16. Thanks, Bart. Bart Van Assche (3): blk-mq: Rename

[PATCH 2/3] blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays

2018-01-19 Thread Bart Van Assche
Make sure that calling blk_mq_run_hw_queue() or blk_mq_kick_requeue_list() triggers a queue run without delay even if blk_mq_delay_run_hw_queue() has been called recently and if its delay has not yet expired. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/blk-mq

[PATCH 1/3] blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()

2018-01-19 Thread Bart Van Assche
Most blk-mq functions have a name that follows the pattern blk_mq_${action}. However, the function name blk_mq_request_direct_issue is an exception. Hence rename this function. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- blo

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-19 Thread Bart Van Assche
On Fri, 2018-01-19 at 23:33 +0800, Ming Lei wrote: > On Fri, Jan 19, 2018 at 03:20:13PM +0000, Bart Van Assche wrote: > > On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote: > > > Please see queue_delayed_work_on(), hctx->run_work is shared by all > > > scheduling,

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On Fri, 2018-01-19 at 10:32 +0800, Ming Lei wrote: > Now most of times both NVMe and SCSI won't return BLK_STS_RESOURCE, and > it should be DM-only which returns STS_RESOURCE so often. That's wrong at least for SCSI. See also https://marc.info/?l=linux-block=151578329417076. Bart.

Re: dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
On Fri, 2018-01-19 at 08:26 +0800, Ming Lei wrote: > No, this case won't return BLK_STS_RESOURCE. It's possible that my attempt to reverse engineer the latest blk-mq changes was wrong. But the queue stall is real and needs to be fixed. Bart.

Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
On Fri, 2018-01-19 at 08:18 +0800, Ming Lei wrote: > On Fri, Jan 19, 2018 at 12:14:24AM +0000, Bart Van Assche wrote: > > On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote: > > > On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote: > > > > diff --git a/d

Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote: > On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote: > > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c > > index f16096af879a..c59c59cfd2a5 100644 > > --- a/drivers/md/dm-rq.c > > +++ b/drivers/md/d

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 15:39 -0700, Jens Axboe wrote: > When you do have a solid test case, please please submit a blktests > test case for it! This needs to be something we can regularly in > testing. Hello Jens, That sounds like a good idea to me. BTW, I think the reason why so far I can

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 17:18 -0500, Laurence Oberman wrote: > OK, I ran 5 at once of 5 separate mount points. > I am using 4k block sizes > Its solid consistent for me. No stalls no gaps. Hi Laurence, That's great news and thank you for having shared this information but I think it should be

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote: > And yet Laurence cannot reproduce any such lockups with your test... Hmm ... maybe I misunderstood Laurence but I don't think that Laurence has already succeeded at running an unmodified version of my tests. In one of the e-mails Laurence

Re: [dm-devel] [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 16:23 -0500, Mike Snitzer wrote: > On Thu, Jan 18 2018 at 3:58P -0500, > Bart Van Assche <bart.vanass...@wdc.com> wrote: > > > On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote: > > > For Bart's test the underlying scsi-mq driver is w

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote: > For Bart's test the underlying scsi-mq driver is what is regularly > hitting this case in __blk_mq_try_issue_directly(): > > if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) Hello Mike, That code path is not the code path

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 13:30 -0500, Mike Snitzer wrote: > 1%!? Where are you getting that number? Ming has detailed more > significant performance gains than 1%.. and not just on lpfc (though you > keep seizing on lpfc because of the low queue_depth of 3). That's what I derived from the numbers

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 12:03 -0500, Mike Snitzer wrote: > On Thu, Jan 18 2018 at 11:50am -0500, > Bart Van Assche <bart.vanass...@wdc.com> wrote: > > My comments about the above are as follows: > > - It can take up to q->rq_timeout jiffies after a .queue_rq() &g

Re: dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 11:50 -0500, Mike Snitzer wrote: > The issue you say it was originally intended to fix _should_ be > addressed with this change: > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16=4dd6edd23e7ea971efddc303f9e67eb79e95808e Hello

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche
On 01/17/18 18:41, Ming Lei wrote: BLK_STS_RESOURCE can be returned from driver when any resource is running out of. And the resource may not be related with tags, such as kmalloc(GFP_ATOMIC), when queue is idle under this kind of BLK_STS_RESOURCE, restart can't work any more, then IO hang may

[PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
is to call blk_mq_delay_run_hw_queue(). Hence this patch. Fixes: ec3eaf9a6731 ("dm mpath: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE") Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Ming Lei <ming@redhat.com> --- drivers/md/dm-rq.c | 1 + 1 file cha

Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback

2018-01-17 Thread Bart Van Assche
On Wed, 2018-01-17 at 18:43 -0500, Laurence Oberman wrote: > On Wed, 2018-01-17 at 23:31 +0000, Bart Van Assche wrote: > > On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote: > > > On Wed, Jan 17 2018 at 11:50am -0500, > > > Jens Axboe <ax...@kernel.dk> wrote:

Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback

2018-01-17 Thread Bart Van Assche
On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote: > On Wed, Jan 17 2018 at 11:50am -0500, > Jens Axboe wrote: > > > On 1/17/18 9:25 AM, Mike Snitzer wrote: > > > Hi Jens, > > > > > > Think this finally takes care of it! ;) > > > > > > Thanks, > > > Mike > > > > > > Mike

[PATCH v2 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()

2018-01-17 Thread Bart Van Assche
() and also in an error path of blk_register_queue(): it is not allowed to hold sysfs_lock around the kobject_del(>kobj) call. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/blk-sysfs.c | 37 - 1 file changed, 28 insertions(+), 9

[PATCH v2 0/3] Avoid that blk_{un,}register_queue() trigger lock inversion

2018-01-17 Thread Bart Van Assche
changing the scheduler from sysfs and unregistering a block layer queue. Bart Van Assche (3): block: Unexport elv_register_queue() and elv_unregister_queue() block: Document scheduler modification locking requirements block: Protect less code with sysfs_lock in blk_{un,}register_queue() block

[PATCH v2 2/3] block: Document scheduler modification locking requirements

2018-01-17 Thread Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/elevator.c | 8 1 file changed, 8 insertions(+) diff --git a/block/elevator.c b/block/elevator.c index 4f00b53cd5fd..e87e9b43aba0 100644 --- a/block/elevator.c +++ b

[PATCH v2 1/3] block: Unexport elv_register_queue() and elv_unregister_queue()

2018-01-17 Thread Bart Van Assche
These two functions are only called from inside the block layer so unexport them. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/blk.h | 3 +++ block/elevator.c | 2 -- include/linux/elevator.h | 2 -- 3 files changed, 3 insertions(+), 4 deletions(-)

Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()

2018-01-17 Thread Bart Van Assche
On Wed, 2018-01-17 at 21:04 +0800, Ming Lei wrote: > On Wed, Jan 17, 2018 at 10:17 AM, Bart Van Assche > <bart.vanass...@wdc.com> wrote: > > Sorry but I think what you wrote is wrong. kobject_del(>kobj) waits > > until all > > ongoing sysfs callback methods, in

Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()

2018-01-16 Thread Bart Van Assche
On Wed, 2018-01-17 at 09:23 +0800, Ming Lei wrote: > On Wed, Jan 17, 2018 at 8:03 AM, Bart Van Assche <bart.vanass...@wdc.com> > wrote: > > On Tue, 2018-01-16 at 17:32 -0500, Mike Snitzer wrote: > > > Therefore it seems to me that all queue_att

Re: [LSF/MM TOPIC] A high-performance userspace block driver

2018-01-16 Thread Bart Van Assche
On Tue, 2018-01-16 at 06:52 -0800, Matthew Wilcox wrote: > I see the improvements that Facebook have been making to the nbd driver, > and I think that's a wonderful thing. Maybe the outcome of this topic > is simply: "Shut up, Matthew, this is good enough". > > It's clear that there's an

Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()

2018-01-16 Thread Bart Van Assche
On Tue, 2018-01-16 at 17:32 -0500, Mike Snitzer wrote: > Therefore it seems to me that all queue_attr_{show,store} are racey vs > blk_unregister_queue() removing the 'queue' kobject. > > And it was just that __elevator_change() was myopicly fixed to address > the race whereas a more generic

Re: [Lsf-pc] [LSF/MM TOPIC] A high-performance userspace block driver

2018-01-16 Thread Bart Van Assche
On Tue, 2018-01-16 at 15:28 -0800, James Bottomley wrote: > On Tue, 2018-01-16 at 18:23 -0500, Theodore Ts'o wrote: > > On Tue, Jan 16, 2018 at 06:52:40AM -0800, Matthew Wilcox wrote: > > > > > > > > > I see the improvements that Facebook have been making to the nbd > > > driver, and I think

[PATCH] block: Fix __bio_integrity_endio() documentation

2018-01-16 Thread Bart Van Assche
Fixes: 4246a0b63bd8 ("block: add a bi_error field to struct bio") Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/bio-integrity.c | 1 - 1 file changed, 1 deletion(-) diff --git a/block/bio-integrity.c b/block/bio-integrity.c index 23b42e8aa03e..9cfdd6c83b5b 10

[PATCH] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()

2018-01-16 Thread Bart Van Assche
nce check, adding .dispatch_wait to a wait queue and removing the wait queue entry must all be protected by both the hctx lock and the wait queue lock. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Omar Sandoval <osan...@fb.com> C

Re: [for-4.16 PATCH v5 0/4] block/dm: allow DM to defer blk_register_queue() until ready

2018-01-16 Thread Bart Van Assche
On Mon, 2018-01-15 at 18:13 -0500, Mike Snitzer wrote: > On Mon, Jan 15 2018 at 6:10P -0500, > Mike Snitzer <snit...@redhat.com> wrote: > > > On Mon, Jan 15 2018 at 5:51pm -0500, > > Bart Van Assche <bart.vanass...@wdc.com> wrote: > > > > > On Mo

[PATCH 2/3] block: Document scheduler change code locking requirements

2018-01-16 Thread Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/elevator.c | 8 1 file changed, 8 insertions(+) diff --git a/block/elevator.c b/block/elevator.c index 4f00b53cd5fd..e87e9b43aba0 100644 --- a/block/elevator.c +++ b

[PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()

2018-01-16 Thread Bart Van Assche
() and also in an error path of blk_register_queue(): it is not allowed to hold sysfs_lock around the kobject_del(>kobj) call. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/blk-sysfs.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/

[PATCH 0/3] Avoid that blk_{un,}register_queue() trigger lock inversion

2018-01-16 Thread Bart Van Assche
Hello Jens, The three patches in this series are what I came up with after having analyzed a lockdep complaint triggered by blk_unregister_queue. Please consider these patches for kernel v4.16. Thanks, Bart. Bart Van Assche (3): block: Unexport elv_register_queue() and elv_unregister_queue

[PATCH 1/3] block: Unexport elv_register_queue() and elv_unregister_queue()

2018-01-16 Thread Bart Van Assche
These two functions are only called from inside the block layer so unexport them. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- block/blk.h | 3 +++ block/elevator.c | 2 -- include/linux/elevator.h | 2 -- 3 files changed, 3 insertions(+), 4 deletions(-)

Re: [for-4.16 PATCH v5 0/4] block/dm: allow DM to defer blk_register_queue() until ready

2018-01-15 Thread Bart Van Assche
On Mon, 2018-01-15 at 17:15 -0500, Mike Snitzer wrote: > sysfs write op calls kernfs_fop_write which takes: > of->mutex then kn->count#213 (no idea what that is) > then q->sysfs_lock (via queue_attr_store) > > vs > > blk_unregister_queue takes: > q->sysfs_lock then > kernfs_mutex (via

Re: [for-4.16 PATCH v5 0/4] block/dm: allow DM to defer blk_register_queue() until ready

2018-01-15 Thread Bart Van Assche
On Mon, 2018-01-15 at 12:48 -0500, Mike Snitzer wrote: > Do I need to do something more to enable lockdep aside from set > CONFIG_LOCKDEP_SUPPORT=y ? Hello Mike, I think you also need to set CONFIG_PROVE_LOCKING=y. Bart.

Re: [for-4.16 PATCH v5 0/4] block/dm: allow DM to defer blk_register_queue() until ready

2018-01-15 Thread Bart Van Assche
On Mon, 2018-01-15 at 12:29 -0500, Mike Snitzer wrote: > So you replied to v5, I emailed a v6 out for the relevant patch. Just > want to make sure you're testing with either Jens' latest tree or are > using my v6 that fixed blk_mq_unregister_dev() to require caller holds > q->sysfs_lock ? Hello

Re: [for-4.16 PATCH v5 0/4] block/dm: allow DM to defer blk_register_queue() until ready

2018-01-15 Thread Bart Van Assche
On Fri, 2018-01-12 at 10:06 -0500, Mike Snitzer wrote: > I'm submitting this v5 with more feeling now ;) Hello Mike, Have these patches been tested with lockdep enabled? The following appeared in the kernel log when after I started testing Jens' for-next tree of this morning:

Re: [PATCH V3 0/5] dm-rq: improve sequential I/O performance

2018-01-12 Thread Bart Van Assche
On Fri, 2018-01-12 at 19:52 -0500, Mike Snitzer wrote: > It was 50 ms before it was 100 ms. No real explaination for these > values other than they seem to make Bart's IB SRP testbed happy? But that constant was not introduced by me in the dm code. See e.g. the following commits: commit

Re: [PATCH V3 0/5] dm-rq: improve sequential I/O performance

2018-01-12 Thread Bart Van Assche
On Fri, 2018-01-12 at 18:17 -0500, Mike Snitzer wrote: > @@ -1570,7 +1570,10 @@ static int multipath_end_io(struct dm_target *ti, > struct request *clone, > if (error && blk_path_error(error)) { > struct multipath *m = ti->private; > > - r = DM_ENDIO_REQUEUE; > +

Re: [PATCH v2 3/4] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device

2018-01-12 Thread Bart Van Assche
On Thu, 2018-01-11 at 10:23 +0800, Ming Lei wrote: > > not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd(). > > Given it is error handling, do we need to prevent the .queuecommand() call > in scsi_send_eh_cmnd()? Could you share us what the actual issue > observed is from

Re: [PATCH] blk-mq-debugfs: Also show requests that have not yet been started

2018-01-12 Thread Bart Van Assche
On Fri, 2018-01-12 at 15:05 -0700, Jens Axboe wrote: > On 1/12/18 3:00 PM, Bart Van Assche wrote: > > On Fri, 2018-01-12 at 14:55 -0700, Jens Axboe wrote: > > > On 1/12/18 2:52 PM, Bart Van Assche wrote: > > > > When debugging e.g. the SCSI timeout handler it is i

[PATCH] blk-mq-debugfs: Also show requests that have not yet been started

2018-01-12 Thread Bart Van Assche
NULL pointer dereference"). Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Ming Lei <ming@redhat.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Hannes Reinecke <h...@suse.com> Cc: Johannes Thumshirn <jthumsh...@suse.de> Cc: Martin K. Petersen <

Re: [PATCHSET v5] blk-mq: reimplement timeout handling

2018-01-12 Thread Bart Van Assche
On Fri, 2018-01-12 at 14:07 -0700, Jens Axboe wrote: > You're really not making it easy for folks to run this :-) My hope is that the ib_srp and ib_srpt patches will be accepted upstream soon. As long as these are not upstream, anyone who wants to retrieve these patches is welcome to clone

Re: [PATCHSET v5] blk-mq: reimplement timeout handling

2018-01-12 Thread Bart Van Assche
On Tue, 2018-01-09 at 08:29 -0800, Tejun Heo wrote: > Currently, blk-mq timeout path synchronizes against the usual > issue/completion path using a complex scheme involving atomic > bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence > rules. Unfortunatley, it contains quite a few

Re: [PATCH V3 2/5] dm-mpath: return DM_MAPIO_REQUEUE in case of rq allocation failure

2018-01-12 Thread Bart Van Assche
On Thu, 2018-01-11 at 14:01 +0800, Ming Lei wrote: > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c > index 86bf502a8e51..fcddf5a62581 100644 > --- a/drivers/md/dm-mpath.c > +++ b/drivers/md/dm-mpath.c > @@ -533,8 +533,20 @@ static int multipath_clone_and_map(struct dm_target *ti, >

Re: [PATCH V3 0/5] dm-rq: improve sequential I/O performance

2018-01-12 Thread Bart Van Assche
On Fri, 2018-01-12 at 13:06 -0500, Mike Snitzer wrote: > OK, you have the stage: please give me a pointer to your best > explaination of the several. Since the previous discussion about this topic occurred more than a month ago it could take more time to look up an explanation than to explain it

Re: [PATCH V3 0/5] dm-rq: improve sequential I/O performance

2018-01-12 Thread Bart Van Assche
On Fri, 2018-01-12 at 12:40 -0500, Mike Snitzer wrote: > You've not explained it many times. That's not correct. I have already several times posted a detailed and easy to understand explanation > We cannot get a straight answer from you. That is a gross and incorrect statement. Please calm

Re: [PATCH V3 0/5] dm-rq: improve sequential I/O performance

2018-01-12 Thread Bart Van Assche
On Fri, 2018-01-12 at 12:18 -0500, Mike Snitzer wrote: > This is going upstream for 4.16: > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=dm-4.16=5b18cff4baedde77e0d69bd62a13ae78f9488d89 That is really gross. I have explained many times in detail on the dm-devel list

Re: [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred

2018-01-11 Thread Bart Van Assche
On Thu, 2018-01-11 at 15:14 -0500, Mike Snitzer wrote: > -void device_add_disk(struct device *parent, struct gendisk *disk) > +void device_add_disk_no_queue_reg(struct device *parent, struct gendisk > *disk) > { > dev_t devt; > int retval; > @@ -682,7 +682,6 @@ void

Re: [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue

2018-01-11 Thread Bart Van Assche
a823fb34a8b ("block: fix warning when I/O elevator is changed as > request_queue is being removed") > Cc: sta...@vger.kernel.org # 4.14+ > Reported-by: Bart Van Assche <bart.vanass...@wdc.com> > Signed-off-by: Mike Snitzer <snit...@redhat.com> > --- > block/blk-

Re: [PATCH V3 0/5] dm-rq: improve sequential I/O performance

2018-01-11 Thread Bart Van Assche
On Thu, 2018-01-11 at 17:58 -0500, Mike Snitzer wrote: > The changes are pretty easy to review. This notion that these changes > are problematic rings very hollow given your lack of actual numbers (or > some other concerning observation rooted in testing fact) to back up > your position. It's

Re: [PATCH V3 0/5] dm-rq: improve sequential I/O performance

2018-01-11 Thread Bart Van Assche
On Thu, 2018-01-11 at 17:07 -0500, Mike Snitzer wrote: > Bart, if for some reason we regress for some workload you're able to > more readily test we can deal with it. But I'm too encouraged by Ming's > performance improvements to hold these changes back any further. Sorry Mike but I think Ming's

<    3   4   5   6   7   8   9   10   11   12   >