Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Sagi Grimberg



>> Can't we just not go through the scheduler for reserved tags? Obviously
>> there is no point in scheduling them...
>
> Right, that would be possible. But I'd rather not treat any requests
> differently, it's a huge pain in the ass that flush requests are currently
> inserted with a driver tag already allocated. So it's not because
> scheduling will add anything at all, it's more that I'd like to move
> flush requests to use regular inserts as well and not deal with some
> request being "special" in any way.
>
> The below should hopefully work. Totally untested...
>
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index 54c84363c1b2..e48bc2c72615 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -181,7 +181,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
>  void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags,
>  		    struct blk_mq_ctx *ctx, unsigned int tag)
>  {
> -	if (tag >= tags->nr_reserved_tags) {
> +	if (!blk_mq_tag_is_reserved(tags, tag)) {
>  		const int real_tag = tag - tags->nr_reserved_tags;
>  
>  		BUG_ON(real_tag >= tags->nr_tags);
> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> index 63497423c5cd..5cb51e53cc03 100644
> --- a/block/blk-mq-tag.h
> +++ b/block/blk-mq-tag.h
> @@ -85,4 +85,10 @@ static inline void blk_mq_tag_set_rq(struct blk_mq_hw_ctx *hctx,
>  	hctx->tags->rqs[tag] = rq;
>  }
>  
> +static inline bool blk_mq_tag_is_reserved(struct blk_mq_tags *tags,
> +					  unsigned int tag)
> +{
> +	return tag < tags->nr_reserved_tags;
> +}
> +
>  #endif
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 9611cd9920e9..293e79c1ee95 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -853,6 +853,9 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
>  		return true;
>  	}
>  
> +	if (blk_mq_tag_is_reserved(data.hctx->sched_tags, rq->internal_tag))
> +		data.flags |= BLK_MQ_REQ_RESERVED;
> +
>  	rq->tag = blk_mq_get_tag(&data);
>  	if (rq->tag >= 0) {
>  		if (blk_mq_tag_busy(data.hctx)) {




Both patches look like they'd work, I'll test. Thanks.


Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Omar Sandoval
On Mon, Feb 27, 2017 at 09:15:27AM -0700, Jens Axboe wrote:
> On 02/27/2017 09:10 AM, Sagi Grimberg wrote:
> > 
> >>> Hm, this may fix the crash, but I'm not sure it'll work as intended.
> >>> When we allocate the request, we'll get a reserved scheduler tag, but
> >>> then when we go to dispatch the request and call
> >>> blk_mq_get_driver_tag(), we'll be competing with all of the normal
> >>> requests for a regular driver tag. So maybe on top of this we should add
> >>> the BLK_MQ_REQ_RESERVED flag to the allocation attempt in
> >>> blk_mq_get_driver_tag() if the scheduler tag is reserved? I'm hazy on
> >>> what we expect from reserved tags, so feel free to call me crazy.
> >>
> >> Yeah good point, we need to carry it through. Reserved tags exist
> >> because drivers often need a request/tag for error handling. If all
> >> tags currently are used up for regular IO that is stuck, you need
> >> a reserved tag for error handling to guarantee progress.
> >>
> >> So Sagi's patch does take it half the way there, but get_driver_tag
> >> really needs to know about this as well, or we will just get stuck
> >> there as well. Two solutions, I can think of:
> >>
> >> 1) Check the tag value in get_driver_tag, add BLK_MQ_REQ_RESERVED
> >>    when allocating a driver tag if above X.
> >> 2) Add an RQF_SOMETHING_RESERVED. Add BLK_MQ_REQ_RESERVED in
> >>    get_driver_tag if that is set.
> >>
> >> Comments?
> > 
> > Can't we just not go through the scheduler for reserved tags? Obviously
> > there is no point in scheduling them...
> 
> Right, that would be possible. But I'd rather not treat any requests
> differently, it's a huge pain in the ass that flush requests are currently
> inserted with a driver tag already allocated. So it's not because
> scheduling will add anything at all, it's more that I'd like to move
> flush requests to use regular inserts as well and not deal with some
> request being "special" in any way.
> 
> The below should hopefully work. Totally untested...

I like your variant if it works for Sagi. My only complaint (which was
already there) is that the BUG_ON(tag >= tags->nr_reserved_tags) in
blk_mq_put_tag() looks kind of silly since we just checked that exact
same condition.
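
To make the point about the redundant check concrete, here is a minimal userspace sketch of the tag layout being discussed, assuming reserved tags occupy [0, nr_reserved_tags) and regular tags follow them. The struct, the array-based bookkeeping and the names `tag_is_reserved()` and `put_tag()` are hypothetical stand-ins, not the kernel's `struct blk_mq_tags` or `blk_mq_put_tag()`, and `assert()` stands in for `BUG_ON()`.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TAGS      8	/* total tags, reserved included (illustrative value) */
#define NR_RESERVED  2	/* tags [0, NR_RESERVED) are reserved (illustrative value) */

/* Hypothetical model: one busy array per range, mirroring separate bitmaps. */
struct tags {
	unsigned int nr_tags;
	unsigned int nr_reserved_tags;
	bool reserved_busy[NR_RESERVED];
	bool regular_busy[NR_TAGS - NR_RESERVED];
};

/* Same predicate as the helper proposed in the patch: low tags are reserved. */
static bool tag_is_reserved(const struct tags *t, unsigned int tag)
{
	return tag < t->nr_reserved_tags;
}

/* Release a tag back to whichever range it came from. */
static void put_tag(struct tags *t, unsigned int tag)
{
	if (!tag_is_reserved(t, tag)) {
		/* Regular tags are stored offset by the reserved count. */
		unsigned int real_tag = tag - t->nr_reserved_tags;

		assert(real_tag < t->nr_tags - t->nr_reserved_tags);
		t->regular_busy[real_tag] = false;
	} else {
		/*
		 * tag < nr_reserved_tags was established by the branch
		 * condition, so re-checking it here would never fire.
		 */
		t->reserved_busy[tag] = false;
	}
}
```

In this model, releasing tag 1 clears a reserved slot, while releasing tag 5 clears regular slot 3.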

> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index 54c84363c1b2..e48bc2c72615 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -181,7 +181,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
>  void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags,
>   struct blk_mq_ctx *ctx, unsigned int tag)
>  {
> - if (tag >= tags->nr_reserved_tags) {
> + if (!blk_mq_tag_is_reserved(tags, tag)) {
>   const int real_tag = tag - tags->nr_reserved_tags;
>  
>   BUG_ON(real_tag >= tags->nr_tags);
> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> index 63497423c5cd..5cb51e53cc03 100644
> --- a/block/blk-mq-tag.h
> +++ b/block/blk-mq-tag.h
> @@ -85,4 +85,10 @@ static inline void blk_mq_tag_set_rq(struct blk_mq_hw_ctx *hctx,
>   hctx->tags->rqs[tag] = rq;
>  }
>  
> +static inline bool blk_mq_tag_is_reserved(struct blk_mq_tags *tags,
> +   unsigned int tag)
> +{
> + return tag < tags->nr_reserved_tags;
> +}
> +
>  #endif
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 9611cd9920e9..293e79c1ee95 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -853,6 +853,9 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
>  		return true;
>  	}
>  
> +	if (blk_mq_tag_is_reserved(data.hctx->sched_tags, rq->internal_tag))
> +		data.flags |= BLK_MQ_REQ_RESERVED;
> +
>  	rq->tag = blk_mq_get_tag(&data);
>  	if (rq->tag >= 0) {
>  		if (blk_mq_tag_busy(data.hctx)) {
> 
> 
> -- 
> Jens Axboe
> 


Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Jens Axboe
On 02/27/2017 09:14 AM, Omar Sandoval wrote:
> On Mon, Feb 27, 2017 at 06:10:01PM +0200, Sagi Grimberg wrote:
>>
>>>> Hm, this may fix the crash, but I'm not sure it'll work as intended.
>>>> When we allocate the request, we'll get a reserved scheduler tag, but
>>>> then when we go to dispatch the request and call
>>>> blk_mq_get_driver_tag(), we'll be competing with all of the normal
>>>> requests for a regular driver tag. So maybe on top of this we should add
>>>> the BLK_MQ_REQ_RESERVED flag to the allocation attempt in
>>>> blk_mq_get_driver_tag() if the scheduler tag is reserved? I'm hazy on
>>>> what we expect from reserved tags, so feel free to call me crazy.
>>>
>>> Yeah good point, we need to carry it through. Reserved tags exist
>>> because drivers often need a request/tag for error handling. If all
>>> tags currently are used up for regular IO that is stuck, you need
>>> a reserved tag for error handling to guarantee progress.
>>>
>>> So Sagi's patch does take it half the way there, but get_driver_tag
>>> really needs to know about this as well, or we will just get stuck
>>> there as well. Two solutions, I can think of:
>>>
>>> 1) Check the tag value in get_driver_tag, add BLK_MQ_REQ_RESERVED
>>>    when allocating a driver tag if above X.
>>> 2) Add an RQF_SOMETHING_RESERVED. Add BLK_MQ_REQ_RESERVED in
>>>    get_driver_tag if that is set.
>>>
>>> Comments?
> 
> Option 1 looks simple enough that I don't think it warrants a new
> request flag (compile tested only):
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 9e6b064e5339..87590f7d4f80 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -852,6 +852,9 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
>  		return true;
>  	}
>  
> +	if (rq->internal_tag < data.hctx->sched_tags->nr_reserved_tags)
> +		data.flags |= BLK_MQ_REQ_RESERVED;
> +
>  	rq->tag = blk_mq_get_tag(&data);
>  	if (rq->tag >= 0) {
>  		if (blk_mq_tag_busy(data.hctx)) {

Agree, that's identical to what I just sent out as well, functionally.

>> Can't we just not go through the scheduler for reserved tags? Obviously
>> there is no point in scheduling them...
> 
> That sounds nice, since I'd be worried about the scheduler also needing
> to be aware of the reserved status lest it also get the reserved request
> stuck behind some normal requests. But, we special case flush in this
> way, and it has been a huge pain.

The caller better be using head insertion for this, in case we already
have requests in the queue. But that's no different than the current
logic.

So I think it should work fine.
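
As a rough illustration of why head insertion matters here, the sketch below models a dispatch queue as a singly linked list. The types and the functions `insert_tail()`, `insert_head()` and `dispatch_next()` are hypothetical and are not blk-mq APIs; the only assumption is that requests are dispatched from the front of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request and queue types, for illustration only. */
struct req {
	int id;
	struct req *next;
};

struct queue {
	struct req *head;
	struct req *tail;
};

/* Normal I/O: append behind whatever is already queued. */
static void insert_tail(struct queue *q, struct req *r)
{
	r->next = NULL;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
}

/* Reserved (e.g. error-handling) request: jump ahead of queued requests. */
static void insert_head(struct queue *q, struct req *r)
{
	r->next = q->head;
	q->head = r;
	if (!q->tail)
		q->tail = r;
}

/* Dispatch always takes the request at the front of the list. */
static struct req *dispatch_next(struct queue *q)
{
	struct req *r = q->head;

	if (!r)
		return NULL;
	q->head = r->next;
	if (!q->head)
		q->tail = NULL;
	r->next = NULL;
	return r;
}
```

With two normal requests already queued via `insert_tail()`, a request added with `insert_head()` is the first one `dispatch_next()` returns, so it cannot get stuck behind them.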

-- 
Jens Axboe



Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Omar Sandoval
On Mon, Feb 27, 2017 at 06:10:01PM +0200, Sagi Grimberg wrote:
> 
> > > Hm, this may fix the crash, but I'm not sure it'll work as intended.
> > > When we allocate the request, we'll get a reserved scheduler tag, but
> > > then when we go to dispatch the request and call
> > > blk_mq_get_driver_tag(), we'll be competing with all of the normal
> > > requests for a regular driver tag. So maybe on top of this we should add
> > > the BLK_MQ_REQ_RESERVED flag to the allocation attempt in
> > > blk_mq_get_driver_tag() if the scheduler tag is reserved? I'm hazy on
> > > what we expect from reserved tags, so feel free to call me crazy.
> > 
> > Yeah good point, we need to carry it through. Reserved tags exist
> > because drivers often need a request/tag for error handling. If all
> > tags currently are used up for regular IO that is stuck, you need
> > a reserved tag for error handling to guarantee progress.
> > 
> > So Sagi's patch does take it half the way there, but get_driver_tag
> > really needs to know about this as well, or we will just get stuck
> > there as well. Two solutions, I can think of:
> > 
> > 1) Check the tag value in get_driver_tag, add BLK_MQ_REQ_RESERVED
> >    when allocating a driver tag if above X.
> > 2) Add an RQF_SOMETHING_RESERVED. Add BLK_MQ_REQ_RESERVED in
> >    get_driver_tag if that is set.
> > 
> > Comments?

Option 1 looks simple enough that I don't think it warrants a new
request flag (compile tested only):

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9e6b064e5339..87590f7d4f80 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -852,6 +852,9 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
 		return true;
 	}
 
+	if (rq->internal_tag < data.hctx->sched_tags->nr_reserved_tags)
+		data.flags |= BLK_MQ_REQ_RESERVED;
+
 	rq->tag = blk_mq_get_tag(&data);
 	if (rq->tag >= 0) {
 		if (blk_mq_tag_busy(data.hctx)) {

> Can't we just not go through the scheduler for reserved tags? Obviously
> there is no point in scheduling them...

That sounds nice, since I'd be worried about the scheduler also needing
to be aware of the reserved status lest it also get the reserved request
stuck behind some normal requests. But, we special case flush in this
way, and it has been a huge pain.


Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Jens Axboe
On 02/27/2017 09:10 AM, Sagi Grimberg wrote:
> 
>>> Hm, this may fix the crash, but I'm not sure it'll work as intended.
>>> When we allocate the request, we'll get a reserved scheduler tag, but
>>> then when we go to dispatch the request and call
>>> blk_mq_get_driver_tag(), we'll be competing with all of the normal
>>> requests for a regular driver tag. So maybe on top of this we should add
>>> the BLK_MQ_REQ_RESERVED flag to the allocation attempt in
>>> blk_mq_get_driver_tag() if the scheduler tag is reserved? I'm hazy on
>>> what we expect from reserved tags, so feel free to call me crazy.
>>
>> Yeah good point, we need to carry it through. Reserved tags exist
>> because drivers often need a request/tag for error handling. If all
>> tags currently are used up for regular IO that is stuck, you need
>> a reserved tag for error handling to guarantee progress.
>>
>> So Sagi's patch does take it half the way there, but get_driver_tag
>> really needs to know about this as well, or we will just get stuck
>> there as well. Two solutions, I can think of:
>>
>> 1) Check the tag value in get_driver_tag, add BLK_MQ_REQ_RESERVED
>>    when allocating a driver tag if above X.
>> 2) Add an RQF_SOMETHING_RESERVED. Add BLK_MQ_REQ_RESERVED in
>>    get_driver_tag if that is set.
>>
>> Comments?
> 
> Can't we just not go through the scheduler for reserved tags? Obviously
> there is no point in scheduling them...

Right, that would be possible. But I'd rather not treat any requests
differently, it's a huge pain in the ass that flush requests are currently
inserted with a driver tag already allocated. So it's not because
scheduling will add anything at all, it's more that I'd like to move
flush requests to use regular inserts as well and not deal with some
request being "special" in any way.

The below should hopefully work. Totally untested...

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 54c84363c1b2..e48bc2c72615 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -181,7 +181,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags,
 		    struct blk_mq_ctx *ctx, unsigned int tag)
 {
-	if (tag >= tags->nr_reserved_tags) {
+	if (!blk_mq_tag_is_reserved(tags, tag)) {
 		const int real_tag = tag - tags->nr_reserved_tags;
 
 		BUG_ON(real_tag >= tags->nr_tags);
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 63497423c5cd..5cb51e53cc03 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -85,4 +85,10 @@ static inline void blk_mq_tag_set_rq(struct blk_mq_hw_ctx *hctx,
 	hctx->tags->rqs[tag] = rq;
 }
 
+static inline bool blk_mq_tag_is_reserved(struct blk_mq_tags *tags,
+					  unsigned int tag)
+{
+	return tag < tags->nr_reserved_tags;
+}
+
 #endif
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9611cd9920e9..293e79c1ee95 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -853,6 +853,9 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
 		return true;
 	}
 
+	if (blk_mq_tag_is_reserved(data.hctx->sched_tags, rq->internal_tag))
+		data.flags |= BLK_MQ_REQ_RESERVED;
+
 	rq->tag = blk_mq_get_tag(&data);
 	if (rq->tag >= 0) {
 		if (blk_mq_tag_busy(data.hctx)) {


-- 
Jens Axboe



Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Sagi Grimberg



>> Hm, this may fix the crash, but I'm not sure it'll work as intended.
>> When we allocate the request, we'll get a reserved scheduler tag, but
>> then when we go to dispatch the request and call
>> blk_mq_get_driver_tag(), we'll be competing with all of the normal
>> requests for a regular driver tag. So maybe on top of this we should add
>> the BLK_MQ_REQ_RESERVED flag to the allocation attempt in
>> blk_mq_get_driver_tag() if the scheduler tag is reserved? I'm hazy on
>> what we expect from reserved tags, so feel free to call me crazy.
>
> Yeah good point, we need to carry it through. Reserved tags exist
> because drivers often need a request/tag for error handling. If all
> tags currently are used up for regular IO that is stuck, you need
> a reserved tag for error handling to guarantee progress.
>
> So Sagi's patch does take it half the way there, but get_driver_tag
> really needs to know about this as well, or we will just get stuck
> there as well. Two solutions, I can think of:
>
> 1) Check the tag value in get_driver_tag, add BLK_MQ_REQ_RESERVED
>    when allocating a driver tag if above X.
> 2) Add an RQF_SOMETHING_RESERVED. Add BLK_MQ_REQ_RESERVED in
>    get_driver_tag if that is set.
>
> Comments?


Can't we just not go through the scheduler for reserved tags? Obviously
there is no point in scheduling them...


Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Omar Sandoval
On Mon, Feb 27, 2017 at 05:36:20PM +0200, Sagi Grimberg wrote:
> Signed-off-by: Sagi Grimberg 
> ---
>  block/blk-mq-sched.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 98c7b061781e..46ca965fff5c 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -454,7 +454,8 @@ int blk_mq_sched_setup(struct request_queue *q)
>*/
>   ret = 0;
>   queue_for_each_hw_ctx(q, hctx, i) {
> - hctx->sched_tags = blk_mq_alloc_rq_map(set, i, q->nr_requests, 0);
> + hctx->sched_tags = blk_mq_alloc_rq_map(set, i,
> + q->nr_requests, set->reserved_tags);
>   if (!hctx->sched_tags) {
>   ret = -ENOMEM;
>   break;
> -- 
> 2.7.4

Hm, this may fix the crash, but I'm not sure it'll work as intended.
When we allocate the request, we'll get a reserved scheduler tag, but
then when we go to dispatch the request and call
blk_mq_get_driver_tag(), we'll be competing with all of the normal
requests for a regular driver tag. So maybe on top of this we should add
the BLK_MQ_REQ_RESERVED flag to the allocation attempt in
blk_mq_get_driver_tag() if the scheduler tag is reserved? I'm hazy on
what we expect from reserved tags, so feel free to call me crazy.


Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Jens Axboe
On 02/27/2017 08:49 AM, Omar Sandoval wrote:
> On Mon, Feb 27, 2017 at 05:36:20PM +0200, Sagi Grimberg wrote:
>> Signed-off-by: Sagi Grimberg 
>> ---
>>  block/blk-mq-sched.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>> index 98c7b061781e..46ca965fff5c 100644
>> --- a/block/blk-mq-sched.c
>> +++ b/block/blk-mq-sched.c
>> @@ -454,7 +454,8 @@ int blk_mq_sched_setup(struct request_queue *q)
>>   */
>>  ret = 0;
>>  queue_for_each_hw_ctx(q, hctx, i) {
>> -hctx->sched_tags = blk_mq_alloc_rq_map(set, i, q->nr_requests, 0);
>> +hctx->sched_tags = blk_mq_alloc_rq_map(set, i,
>> +q->nr_requests, set->reserved_tags);
>>  if (!hctx->sched_tags) {
>>  ret = -ENOMEM;
>>  break;
>> -- 
>> 2.7.4
> 
> Hm, this may fix the crash, but I'm not sure it'll work as intended.
> When we allocate the request, we'll get a reserved scheduler tag, but
> then when we go to dispatch the request and call
> blk_mq_get_driver_tag(), we'll be competing with all of the normal
> requests for a regular driver tag. So maybe on top of this we should add
> the BLK_MQ_REQ_RESERVED flag to the allocation attempt in
> blk_mq_get_driver_tag() if the scheduler tag is reserved? I'm hazy on
> what we expect from reserved tags, so feel free to call me crazy.

Yeah good point, we need to carry it through. Reserved tags exist
because drivers often need a request/tag for error handling. If all
tags currently are used up for regular IO that is stuck, you need
a reserved tag for error handling to guarantee progress.

So Sagi's patch does take it halfway there, but get_driver_tag
really needs to know about this as well, or we will just get stuck
there too. Two solutions I can think of:

1) Check the tag value in get_driver_tag, add BLK_MQ_REQ_RESERVED
   when allocating a driver tag if above X.
2) Add an RQF_SOMETHING_RESERVED. Add BLK_MQ_REQ_RESERVED in
   get_driver_tag if that is set.

Comments?

-- 
Jens Axboe
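
The problem described in this message can be sketched in userspace C. This is a minimal model under stated assumptions, not kernel code: `struct tag_pool`, `pool_get()`, `get_driver_tag()` and the pool sizes are all hypothetical, and the `carry` parameter exists only to contrast the two behaviours. It shows a request that holds a reserved scheduler tag failing to get a driver tag when all regular driver tags are stuck, unless the reserved status is carried through to the driver-tag allocation.

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_NR_TAGS      8	/* total tags per pool (illustrative value) */
#define POOL_NR_RESERVED  2	/* tags [0, POOL_NR_RESERVED) are held back */

/* Hypothetical tag pool; used for both scheduler tags and driver tags. */
struct tag_pool {
	bool busy[POOL_NR_TAGS];
};

/*
 * Allocate from the reserved range when 'reserved' is set, otherwise from
 * the regular range. Returns the tag, or -1 if that range is exhausted.
 */
static int pool_get(struct tag_pool *p, bool reserved)
{
	int start = reserved ? 0 : POOL_NR_RESERVED;
	int end = reserved ? POOL_NR_RESERVED : POOL_NR_TAGS;

	for (int i = start; i < end; i++) {
		if (!p->busy[i]) {
			p->busy[i] = true;
			return i;
		}
	}
	return -1;
}

static bool pool_tag_is_reserved(int tag)
{
	return tag >= 0 && tag < POOL_NR_RESERVED;
}

/*
 * Dispatch step: pick a driver tag for a request that already holds
 * sched_tag. With 'carry' set, a reserved scheduler tag leads to a
 * reserved driver-tag allocation (option 1 from the message above).
 */
static int get_driver_tag(struct tag_pool *driver, int sched_tag, bool carry)
{
	bool reserved = carry && pool_tag_is_reserved(sched_tag);

	return pool_get(driver, reserved);
}

/* Simulate stuck regular I/O by consuming every regular driver tag. */
static void exhaust_regular(struct tag_pool *p)
{
	while (pool_get(p, false) >= 0)
		;
}
```

After `exhaust_regular()` on the driver pool, `get_driver_tag()` with `carry` false returns -1 for a request holding reserved scheduler tag 0, while the same call with `carry` true succeeds from the reserved range.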



Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Jens Axboe
On 02/27/2017 08:36 AM, Sagi Grimberg wrote:
> Signed-off-by: Sagi Grimberg 

Thanks for finding these, Sagi. Applied 1-2 for this series.

-- 
Jens Axboe



[PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg 
---
 block/blk-mq-sched.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 98c7b061781e..46ca965fff5c 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -454,7 +454,8 @@ int blk_mq_sched_setup(struct request_queue *q)
 	 */
 	ret = 0;
 	queue_for_each_hw_ctx(q, hctx, i) {
-		hctx->sched_tags = blk_mq_alloc_rq_map(set, i, q->nr_requests, 0);
+		hctx->sched_tags = blk_mq_alloc_rq_map(set, i,
+				q->nr_requests, set->reserved_tags);
 		if (!hctx->sched_tags) {
 			ret = -ENOMEM;
 			break;
-- 
2.7.4