On 01/19/18 07:24, Jens Axboe wrote:
That's what I thought. So for a low queue depth underlying queue, it's
quite possible that this situation can happen. Two potential solutions
I see:
1) As described earlier in this thread, having a mechanism for being
notified when the scarce resource becomes available
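To illustrate option 1 in rough terms: the layer that owns the scarce resource would kick the stacked queue again when that resource is freed, instead of leaving the upper queue idle. A minimal sketch under that assumption (sketch_release_clone() is a hypothetical helper, not code from this thread):

#include <linux/blkdev.h>
#include <linux/blk-mq.h>

/*
 * Hypothetical sketch of the "notify when the resource frees up" idea:
 * releasing the underlying request (the scarce resource) also re-runs the
 * stacked queue, so a request that previously failed with BLK_STS_RESOURCE
 * gets retried promptly.
 */
static void sketch_release_clone(struct request *clone,
				 struct request_queue *stacked_q)
{
	blk_put_request(clone);			/* the scarce resource is now free */
	blk_mq_run_hw_queues(stacked_q, true);	/* asynchronously re-run the upper queue */
}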
On Tue, 2018-01-23 at 10:22 +0100, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 5:20pm -0500,
> Bart Van Assche wrote:
>
> > On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > > And yet Laurence cannot reproduce any such lockups with your test...
> >
> > Hmm ... maybe I misunderstood La
On Tue, Jan 23 2018 at 7:17am -0500,
Ming Lei wrote:
> On Tue, Jan 23, 2018 at 8:15 PM, Mike Snitzer wrote:
> > On Tue, Jan 23 2018 at 5:53am -0500,
> > Ming Lei wrote:
> >
> >> Hi Mike,
> >>
> >> On Tue, Jan 23, 2018 at 10:22:04AM +0100, Mike Snitzer wrote:
> >> >
> >> > From: Mike Snitzer
On Tue, Jan 23, 2018 at 8:15 PM, Mike Snitzer wrote:
> On Tue, Jan 23 2018 at 5:53am -0500,
> Ming Lei wrote:
>
>> Hi Mike,
>>
>> On Tue, Jan 23, 2018 at 10:22:04AM +0100, Mike Snitzer wrote:
>> > On Thu, Jan 18 2018 at 5:20pm -0500,
>> > Bart Van Assche wrote:
>> >
>> > > On Thu, 2018-01-18 a
On Tue, Jan 23 2018 at 5:53am -0500,
Ming Lei wrote:
> Hi Mike,
>
> On Tue, Jan 23, 2018 at 10:22:04AM +0100, Mike Snitzer wrote:
> > On Thu, Jan 18 2018 at 5:20pm -0500,
> > Bart Van Assche wrote:
> >
> > > On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > > > And yet Laurence cann
Hi Mike,
On Tue, Jan 23, 2018 at 10:22:04AM +0100, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 5:20pm -0500,
> Bart Van Assche wrote:
>
> > On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > > And yet Laurence cannot reproduce any such lockups with your test...
> >
> > Hmm ... maybe I
On Thu, Jan 18 2018 at 5:20pm -0500,
Bart Van Assche wrote:
> On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > And yet Laurence cannot reproduce any such lockups with your test...
>
> Hmm ... maybe I misunderstood Laurence but I don't think that Laurence has
> succeeded yet at run
On 1/19/18 4:52 PM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 10:38:41AM -0700, Jens Axboe wrote:
>> On 1/19/18 9:37 AM, Ming Lei wrote:
>>> On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
On 1/19/18 9:26 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axbo
On Fri, Jan 19, 2018 at 09:23:35AM -0700, Jens Axboe wrote:
> On 1/19/18 9:13 AM, Mike Snitzer wrote:
> > On Fri, Jan 19 2018 at 10:48am -0500,
> > Jens Axboe wrote:
> >
> >> On 1/19/18 8:40 AM, Ming Lei wrote:
> >> Where does the dm STS_RESOURCE error usually come from - what exact
> >
On Fri, Jan 19, 2018 at 10:38:41AM -0700, Jens Axboe wrote:
> On 1/19/18 9:37 AM, Ming Lei wrote:
> > On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
> >> On 1/19/18 9:26 AM, Ming Lei wrote:
> >>> On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axboe wrote:
> On 1/19/18 9:05 AM, Min
On Fri, 2018-01-19 at 15:34 +0800, Ming Lei wrote:
> Could you explain a bit about when a SCSI target replies with BUSY very often?
>
> Inside the initiator, we have already limited the max per-LUN requests and
> per-host requests before calling .queue_rq().
That's correct. However, when a SCSI initiator and
On Fri, Jan 19 2018 at 12:38pm -0500,
Jens Axboe wrote:
> On 1/19/18 9:37 AM, Ming Lei wrote:
> > On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
> >>
> >> There are no pending requests for this case, nothing to restart the
> >> queue. When you fail that blk_get_request(), you are idl
On Fri, Jan 19, 2018 at 10:38:41AM -0700, Jens Axboe wrote:
> On 1/19/18 9:37 AM, Ming Lei wrote:
> > On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
> >> On 1/19/18 9:26 AM, Ming Lei wrote:
> >>> On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axboe wrote:
> On 1/19/18 9:05 AM, Min
On 1/19/18 9:37 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
>> On 1/19/18 9:26 AM, Ming Lei wrote:
>>> On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axboe wrote:
On 1/19/18 9:05 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 08:48:55AM -0700, Jens Axbo
On Fri, Jan 19, 2018 at 10:09:11AM -0700, Jens Axboe wrote:
> On 1/19/18 10:05 AM, Ming Lei wrote:
> > On Fri, Jan 19, 2018 at 09:52:32AM -0700, Jens Axboe wrote:
> >> On 1/19/18 9:47 AM, Mike Snitzer wrote:
> >>> On Fri, Jan 19 2018 at 11:41am -0500,
> >>> Jens Axboe wrote:
> >>>
> On 1/19/1
On 1/19/18 10:05 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 09:52:32AM -0700, Jens Axboe wrote:
>> On 1/19/18 9:47 AM, Mike Snitzer wrote:
>>> On Fri, Jan 19 2018 at 11:41am -0500,
>>> Jens Axboe wrote:
>>>
On 1/19/18 9:37 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 09:27:46AM -0700,
On Fri, Jan 19, 2018 at 09:52:32AM -0700, Jens Axboe wrote:
> On 1/19/18 9:47 AM, Mike Snitzer wrote:
> > On Fri, Jan 19 2018 at 11:41am -0500,
> > Jens Axboe wrote:
> >
> >> On 1/19/18 9:37 AM, Ming Lei wrote:
> >>> On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
> On 1/19/18 9:
On 1/19/18 9:47 AM, Mike Snitzer wrote:
> On Fri, Jan 19 2018 at 11:41am -0500,
> Jens Axboe wrote:
>
>> On 1/19/18 9:37 AM, Ming Lei wrote:
>>> On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
On 1/19/18 9:26 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jen
On Fri, Jan 19 2018 at 11:41am -0500,
Jens Axboe wrote:
> On 1/19/18 9:37 AM, Ming Lei wrote:
> > On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
> >> On 1/19/18 9:26 AM, Ming Lei wrote:
> >>> On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axboe wrote:
> >>
> >> There are no pending r
On 1/19/18 9:37 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
>> On 1/19/18 9:26 AM, Ming Lei wrote:
>>> On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axboe wrote:
On 1/19/18 9:05 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 08:48:55AM -0700, Jens Axbo
On Fri, Jan 19, 2018 at 09:27:46AM -0700, Jens Axboe wrote:
> On 1/19/18 9:26 AM, Ming Lei wrote:
> > On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axboe wrote:
> >> On 1/19/18 9:05 AM, Ming Lei wrote:
> >>> On Fri, Jan 19, 2018 at 08:48:55AM -0700, Jens Axboe wrote:
> On 1/19/18 8:40 AM, Min
On 1/19/18 9:26 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axboe wrote:
>> On 1/19/18 9:05 AM, Ming Lei wrote:
>>> On Fri, Jan 19, 2018 at 08:48:55AM -0700, Jens Axboe wrote:
On 1/19/18 8:40 AM, Ming Lei wrote:
Where does the dm STS_RESOURCE error usually com
On Fri, Jan 19, 2018 at 09:19:24AM -0700, Jens Axboe wrote:
> On 1/19/18 9:05 AM, Ming Lei wrote:
> > On Fri, Jan 19, 2018 at 08:48:55AM -0700, Jens Axboe wrote:
> >> On 1/19/18 8:40 AM, Ming Lei wrote:
> >> Where does the dm STS_RESOURCE error usually come from - what exact
> >> resource
On 1/19/18 9:13 AM, Mike Snitzer wrote:
> On Fri, Jan 19 2018 at 10:48am -0500,
> Jens Axboe wrote:
>
>> On 1/19/18 8:40 AM, Ming Lei wrote:
>> Where does the dm STS_RESOURCE error usually come from - what exact
>> resource are we running out of?
>
> It is from blk_get_request(u
On 1/19/18 9:05 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 08:48:55AM -0700, Jens Axboe wrote:
>> On 1/19/18 8:40 AM, Ming Lei wrote:
>> Where does the dm STS_RESOURCE error usually come from - what exact
>> resource are we running out of?
>
> It is from blk_get_request(underly
On Fri, Jan 19 2018 at 10:48am -0500,
Jens Axboe wrote:
> On 1/19/18 8:40 AM, Ming Lei wrote:
> Where does the dm STS_RESOURCE error usually come from - what exact
> resource are we running out of?
> >>>
> >>> It is from blk_get_request(underlying queue), see
> >>> multipath_clone_and
On Fri, 2018-01-19 at 23:33 +0800, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 03:20:13PM +0000, Bart Van Assche wrote:
> > On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
> > > Please see queue_delayed_work_on(): hctx->run_work is shared by all
> > > scheduling, so once blk_mq_delay_run_hw_queue(100
On Fri, Jan 19, 2018 at 08:48:55AM -0700, Jens Axboe wrote:
> On 1/19/18 8:40 AM, Ming Lei wrote:
> Where does the dm STS_RESOURCE error usually come from - what exact
> resource are we running out of?
> >>>
> >>> It is from blk_get_request(underlying queue), see
> >>> multipath_clone_a
On 1/19/18 8:40 AM, Ming Lei wrote:
Where does the dm STS_RESOURCE error usually come from - what exact
resource are we running out of?
>>>
>>> It is from blk_get_request(underlying queue), see
>>> multipath_clone_and_map().
>>
>> That's what I thought. So for a low queue depth underlyi
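For readers joining mid-thread, the call Jens and Ming are referring to sits in dm-mpath's request-clone path. A simplified paraphrase of that failure case (trimmed of path selection and error handling, so not the exact kernel source):

#include <linux/err.h>
#include <linux/blkdev.h>
#include <linux/device-mapper.h>

/* Simplified paraphrase of the multipath_clone_and_map() failure case. */
static int sketch_clone_and_map(struct request *rq,
				struct request_queue *underlying_q,
				struct request **__clone)
{
	struct request *clone;

	/* Atomic allocation from the (possibly low queue depth) underlying queue */
	clone = blk_get_request(underlying_q, rq->cmd_flags | REQ_NOMERGE,
				GFP_ATOMIC);
	if (IS_ERR(clone))
		/* No request available; dm-rq turns this into BLK_STS_RESOURCE */
		return DM_MAPIO_REQUEUE;

	*__clone = clone;
	return DM_MAPIO_REMAPPED;
}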
On Fri, Jan 19, 2018 at 08:24:06AM -0700, Jens Axboe wrote:
> On 1/19/18 12:26 AM, Ming Lei wrote:
> > On Thu, Jan 18, 2018 at 09:02:45PM -0700, Jens Axboe wrote:
> >> On 1/18/18 7:32 PM, Ming Lei wrote:
> >>> On Thu, Jan 18, 2018 at 01:11:01PM -0700, Jens Axboe wrote:
> On 1/18/18 11:47 AM, B
On Fri, Jan 19, 2018 at 03:20:13PM +0000, Bart Van Assche wrote:
> On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
> > Please see queue_delayed_work_on(): hctx->run_work is shared by all
> > scheduling, so once blk_mq_delay_run_hw_queue(100ms) returns, no new
> > scheduling can make progress during
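The workqueue semantics behind Ming's point: queue_delayed_work_on() does nothing and returns false if the work item is already pending, so a single shared hctx->run_work that is armed with a 100ms delay absorbs any earlier re-run attempt until that timer fires. A self-contained sketch of that behaviour with illustrative names (this models the semantics, it is not blk-mq internals):

#include <linux/workqueue.h>
#include <linux/jiffies.h>
#include <linux/printk.h>

static void sketch_run_queue_fn(struct work_struct *work)
{
	/* Here the hardware queue would actually be run. */
}

/* Stands in for the single, shared hctx->run_work. */
static DECLARE_DELAYED_WORK(sketch_run_work, sketch_run_queue_fn);

static void sketch_kick_queue(unsigned long delay_ms)
{
	/*
	 * If a 100ms run is already queued, queue_delayed_work_on() returns
	 * false and this (possibly immediate) kick is simply lost until the
	 * pending timer expires.
	 */
	if (!queue_delayed_work_on(WORK_CPU_UNBOUND, system_wq, &sketch_run_work,
				   msecs_to_jiffies(delay_ms)))
		pr_debug("run_work already pending; kick ignored\n");
}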
On 1/19/18 8:20 AM, Bart Van Assche wrote:
> On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
>> Please see queue_delayed_work_on(): hctx->run_work is shared by all
>> scheduling, so once blk_mq_delay_run_hw_queue(100ms) returns, no new
>> scheduling can make progress during the 100ms.
>
> How abou
On 1/19/18 12:26 AM, Ming Lei wrote:
> On Thu, Jan 18, 2018 at 09:02:45PM -0700, Jens Axboe wrote:
>> On 1/18/18 7:32 PM, Ming Lei wrote:
>>> On Thu, Jan 18, 2018 at 01:11:01PM -0700, Jens Axboe wrote:
On 1/18/18 11:47 AM, Bart Van Assche wrote:
>> This is all very tiresome.
>
> Ye
On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
> Please see queue_delayed_work_on(): hctx->run_work is shared by all
> scheduling, so once blk_mq_delay_run_hw_queue(100ms) returns, no new
> scheduling can make progress during the 100ms.
How about addressing that as follows:
diff --git a/block/bl
On Fri, Jan 19, 2018 at 05:09:46AM +0000, Bart Van Assche wrote:
> On Fri, 2018-01-19 at 10:32 +0800, Ming Lei wrote:
> > Now, most of the time, neither NVMe nor SCSI returns BLK_STS_RESOURCE, and
> > it should be only DM that returns STS_RESOURCE so often.
>
> That's wrong at least for SCSI. See al
On Thu, Jan 18, 2018 at 09:02:45PM -0700, Jens Axboe wrote:
> On 1/18/18 7:32 PM, Ming Lei wrote:
> > On Thu, Jan 18, 2018 at 01:11:01PM -0700, Jens Axboe wrote:
> >> On 1/18/18 11:47 AM, Bart Van Assche wrote:
> This is all very tiresome.
> >>>
> >>> Yes, this is tiresome. It is very annoying
On Fri, 2018-01-19 at 10:32 +0800, Ming Lei wrote:
> Now, most of the time, neither NVMe nor SCSI returns BLK_STS_RESOURCE, and
> it should be only DM that returns STS_RESOURCE so often.
That's wrong at least for SCSI. See also
https://marc.info/?l=linux-block&m=151578329417076.
Bart.
On 1/18/18 7:32 PM, Ming Lei wrote:
> On Thu, Jan 18, 2018 at 01:11:01PM -0700, Jens Axboe wrote:
>> On 1/18/18 11:47 AM, Bart Van Assche wrote:
This is all very tiresome.
>>>
>>> Yes, this is tiresome. It is very annoying to me that others keep
>>> introducing so many regressions in such impo
On Thu, Jan 18, 2018 at 01:11:01PM -0700, Jens Axboe wrote:
> On 1/18/18 11:47 AM, Bart Van Assche wrote:
> >> This is all very tiresome.
> >
> > Yes, this is tiresome. It is very annoying to me that others keep
> > introducing so many regressions in such important parts of the kernel.
> > It is a
On Thu, 2018-01-18 at 15:39 -0700, Jens Axboe wrote:
> When you do have a solid test case, please please submit a blktests
> test case for it! This needs to be something we can run regularly in
> testing.
Hello Jens,
That sounds like a good idea to me. BTW, I think the reason why so far I
can reprodu
On 1/18/18 3:35 PM, Laurence Oberman wrote:
> On Thu, 2018-01-18 at 22:24 +0000, Bart Van Assche wrote:
>> On Thu, 2018-01-18 at 17:18 -0500, Laurence Oberman wrote:
>>> OK, I ran 5 at once on 5 separate mount points.
>>> I am using 4k block sizes.
>>> It's solidly consistent for me. No stalls, no gaps.
On Thu, 2018-01-18 at 22:24 +0000, Bart Van Assche wrote:
> On Thu, 2018-01-18 at 17:18 -0500, Laurence Oberman wrote:
> > OK, I ran 5 at once on 5 separate mount points.
> > I am using 4k block sizes.
> > It's solidly consistent for me. No stalls, no gaps.
>
> Hi Laurence,
>
> That's great news and t
On Thu, 2018-01-18 at 17:18 -0500, Laurence Oberman wrote:
> OK, I ran 5 at once on 5 separate mount points.
> I am using 4k block sizes.
> It's solidly consistent for me. No stalls, no gaps.
Hi Laurence,
That's great news and thank you for having shared this information but I think
it should be menti
On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> And yet Laurence cannot reproduce any such lockups with your test...
Hmm ... maybe I misunderstood Laurence but I don't think that Laurence has
succeeded yet at running an unmodified version of my tests. In one of the
e-mails Laurence se
On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 4:39pm -0500,
> Bart Van Assche wrote:
>
> > On Thu, 2018-01-18 at 16:23 -0500, Mike Snitzer wrote:
> > > On Thu, Jan 18 2018 at 3:58pm -0500,
> > > Bart Van Assche wrote:
> > >
> > > > On Thu, 2018-01-18 at 15:48
On Thu, Jan 18 2018 at 4:39pm -0500,
Bart Van Assche wrote:
> On Thu, 2018-01-18 at 16:23 -0500, Mike Snitzer wrote:
> > On Thu, Jan 18 2018 at 3:58pm -0500,
> > Bart Van Assche wrote:
> >
> > > On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote:
> > > > For Bart's test the underlying scsi-
On Thu, 2018-01-18 at 16:23 -0500, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 3:58pm -0500,
> Bart Van Assche wrote:
>
> > On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote:
> > > For Bart's test the underlying scsi-mq driver is what is regularly
> > > hitting this case in __blk_mq_try_issu
On Thu, 2018-01-18 at 16:23 -0500, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 3:58pm -0500,
> Bart Van Assche wrote:
>
> > On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote:
> > > For Bart's test the underlying scsi-mq driver is what is
> > > regularly
> > > hitting this case in __blk_mq_tr
On Thu, Jan 18 2018 at 3:58pm -0500,
Bart Van Assche wrote:
> On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote:
> > For Bart's test the underlying scsi-mq driver is what is regularly
> > hitting this case in __blk_mq_try_issue_directly():
> >
> > if (blk_mq_hctx_stopped(hctx) || blk
On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote:
> For Bart's test the underlying scsi-mq driver is what is regularly
> hitting this case in __blk_mq_try_issue_directly():
>
> if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))
Hello Mike,
That code path is not the code path th
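For context on the check Mike quotes: it guards the top of the direct-issue path, and when it fires the request is inserted normally instead of being handed straight to ->queue_rq(). A rough sketch of that shape (a paraphrase of __blk_mq_try_issue_directly(), not line-for-line kernel source):

#include <linux/blkdev.h>
#include <linux/blk-mq.h>

/*
 * Paraphrase of the guard: a stopped or quiesced queue must not be
 * bypassed, so the caller falls back to normal insertion instead of
 * issuing the request directly.
 */
static bool sketch_can_issue_directly(struct blk_mq_hw_ctx *hctx,
				      struct request *rq)
{
	struct request_queue *q = rq->q;

	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))
		return false;	/* insert instead of issuing directly */

	return true;
}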
On Thu, Jan 18 2018 at 3:11pm -0500,
Jens Axboe wrote:
> On 1/18/18 11:47 AM, Bart Van Assche wrote:
> >> This is all very tiresome.
> >
> > Yes, this is tiresome. It is very annoying to me that others keep
> > introducing so many regressions in such important parts of the kernel.
> > It is als
On 1/18/18 11:47 AM, Bart Van Assche wrote:
>> This is all very tiresome.
>
> Yes, this is tiresome. It is very annoying to me that others keep
> introducing so many regressions in such important parts of the kernel.
> It is also annoying to me that I get blamed if I report a regression
> instead
On Thu, 2018-01-18 at 13:30 -0500, Mike Snitzer wrote:
> 1%!? Where are you getting that number? Ming has detailed more
> significant performance gains than 1%.. and not just on lpfc (though you
> keep seizing on lpfc because of the low queue_depth of 3).
That's what I derived from the numbers y
On Thu, Jan 18 2018 at 12:20pm -0500,
Bart Van Assche wrote:
> On Thu, 2018-01-18 at 12:03 -0500, Mike Snitzer wrote:
> > On Thu, Jan 18 2018 at 11:50am -0500,
> > Bart Van Assche wrote:
> > > My comments about the above are as follows:
> > > - It can take up to q->rq_timeout jiffies after a .qu
On Thu, 2018-01-18 at 12:03 -0500, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 11:50am -0500,
> Bart Van Assche wrote:
> > My comments about the above are as follows:
> > - It can take up to q->rq_timeout jiffies after a .queue_rq()
> > implementation returned BLK_STS_RESOURCE before blk_mq_tim
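Bart's first bullet is the cost of doing nothing after a BLK_STS_RESOURCE return on an idle queue: the next chance to make progress is the request timeout. The mitigation being argued over in this thread is to arm a short delayed re-run before returning. Roughly, as a sketch of that pattern (sketch_grab_resource() is a hypothetical stand-in, and this is not the actual dm patch):

#include <linux/blkdev.h>
#include <linux/blk-mq.h>

/* Hypothetical resource check; pretend the resource has run out. */
static bool sketch_grab_resource(struct request *rq)
{
	return false;
}

/*
 * A .queue_rq() that cannot rely on tag completions to restart the queue
 * schedules its own re-run, so the requeued request is retried after
 * roughly 100ms rather than only after q->rq_timeout.
 */
static blk_status_t sketch_queue_rq(struct blk_mq_hw_ctx *hctx,
				    const struct blk_mq_queue_data *bd)
{
	if (!sketch_grab_resource(bd->rq)) {
		blk_mq_delay_run_hw_queue(hctx, 100);	/* milliseconds */
		return BLK_STS_RESOURCE;		/* request is requeued */
	}

	/* ... normal submission path ... */
	return BLK_STS_OK;
}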
On Thu, Jan 18 2018 at 11:50am -0500,
Bart Van Assche wrote:
> On 01/17/18 18:41, Ming Lei wrote:
> >BLK_STS_RESOURCE can be returned from a driver when any resource
> >runs out, and that resource may not be related to tags, for example a
> >failed kmalloc(GFP_ATOMIC). When the queue is idle under this ki
On 01/17/18 18:41, Ming Lei wrote:
BLK_STS_RESOURCE can be returned from a driver when any resource
runs out, and that resource may not be related to tags, for example a
failed kmalloc(GFP_ATOMIC). When the queue is idle under this kind of
BLK_STS_RESOURCE, restart can't work any more, and then an IO hang may
be
BLK_STS_RESOURCE can be returned from a driver when any resource
runs out, and that resource may not be related to tags, for example a
failed kmalloc(GFP_ATOMIC). When the queue is idle under this kind of
BLK_STS_RESOURCE, restart can't work any more, and then an IO hang may
be caused.
Most drivers may call km
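To make the hang described in that commit message concrete: when the scarce resource is an atomic allocation rather than a tag, the driver can return BLK_STS_RESOURCE while the queue is otherwise idle, and then nothing is in flight whose completion would trigger the restart mechanism. A minimal sketch of that case (illustrative driver code, not taken from the thread):

#include <linux/blkdev.h>
#include <linux/blk-mq.h>
#include <linux/slab.h>

struct sketch_cmd {
	struct request *rq;
	/* ... driver-private bookkeeping ... */
};

/*
 * The resource that runs out here is memory (kmalloc(GFP_ATOMIC)), not a
 * tag. If this request was the only one on an idle queue, no completion
 * will ever mark the hctx for restart, so the requeued request can wait
 * indefinitely; that is the IO hang the commit message describes.
 */
static blk_status_t sketch_hang_queue_rq(struct blk_mq_hw_ctx *hctx,
					 const struct blk_mq_queue_data *bd)
{
	struct sketch_cmd *cmd = kmalloc(sizeof(*cmd), GFP_ATOMIC);

	if (!cmd)
		return BLK_STS_RESOURCE;	/* nothing pending to restart us */

	cmd->rq = bd->rq;
	/* ... issue the request using cmd ... */
	return BLK_STS_OK;
}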