On Fri, 2018-11-16 at 10:53 +0100, Hannes Reinecke wrote:
> On 11/15/18 6:58 PM, Keith Busch wrote:
> > The scsi timeout error handling had been directly updating the block
> > layer's request state to prevent a error handling and a natural completion
> > from completing the same request twice. Fix this layering violation
> > by having scsi control the fate of its commands with scsi owned flags
> > rather than use blk-mq's.
> > 
> > Signed-off-by: Keith Busch <keith.bu...@intel.com>
> > ---
> >   drivers/scsi/scsi_error.c | 22 +++++++++++-----------
> >   drivers/scsi/scsi_lib.c   |  6 +++++-
> >   include/scsi/scsi_cmnd.h  |  5 ++++-
> >   3 files changed, 20 insertions(+), 13 deletions(-)
> > 
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index dd338a8cd275..e92e088f636f 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -297,19 +297,19 @@ enum blk_eh_timer_return scsi_times_out(struct 
> > request *req)
> >   
> >     if (rtn == BLK_EH_DONE) {
> >             /*
> > -            * For blk-mq, we must set the request state to complete now
> > -            * before sending the request to the scsi error handler. This
> > -            * will prevent a use-after-free in the event the LLD manages
> > -            * to complete the request before the error handler finishes
> > -            * processing this timed out request.
> > +            * Set the command to complete first in order to prevent a real
> > +            * completion from releasing the command while error handling
> > +            * is using it. If the command was already completed, then the
> > +            * lower level driver beat the timeout handler, and it is safe
> > +            * to return without escalating error recovery.
> >              *
> > -            * If the request was already completed, then the LLD beat the
> > -            * time out handler from transferring the request to the scsi
> > -            * error handler. In that case we can return immediately as no
> > -            * further action is required.
> > +            * If timeout handling lost the race to a real completion, the
> > +            * block layer may ignore that due to a fake timeout injection,
> > +            * so return RESET_TIMER to allow error handling another shot
> > +            * at this command.
> >              */
> > -           if (!blk_mq_mark_complete(req))
> > -                   return rtn;
> > +           if (test_and_set_bit(__SCMD_COMPLETE, &scmd->flags))
> > +                   return BLK_EH_RESET_TIMER;
> >             if (scsi_abort_command(scmd) != SUCCESS) {
> >                     set_host_byte(scmd, DID_TIME_OUT);
> >                     scsi_eh_scmd_add(scmd);
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index 5d83a162d03b..c1d5e4e36125 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1635,8 +1635,11 @@ static blk_status_t scsi_mq_prep_fn(struct request 
> > *req)
> >   
> >   static void scsi_mq_done(struct scsi_cmnd *cmd)
> >   {
> > +   if (unlikely(test_and_set_bit(__SCMD_COMPLETE, &cmd->flags)))
> > +           return;
> >     trace_scsi_dispatch_cmd_done(cmd);
> > -   blk_mq_complete_request(cmd->request);
> > +   if (unlikely(!blk_mq_complete_request(cmd->request)))
> > +           clear_bit(__SCMD_COMPLETE, &cmd->flags);
> >   }
> >   
> >   static void scsi_mq_put_budget(struct blk_mq_hw_ctx *hctx)
> > @@ -1701,6 +1704,7 @@ static blk_status_t scsi_queue_rq(struct 
> > blk_mq_hw_ctx *hctx,
> >                     goto out_dec_host_busy;
> >             req->rq_flags |= RQF_DONTPREP;
> >     } else {
> > +           cmd->flags &= ~SCMD_COMPLETE;
> >             blk_mq_start_request(req);
> >     }
> >   
> 
> Why do you use a normal assignment here (and not a clear_bit() as in the 
> above hunk)?
> 
> And shouldn't you add a barrier here to broadcast the state change to 
> other cores?

Although I don't like it that request state information is being moved from 
the block layer core into the SCSI core, I think that this patch is fine.
scsi_queue_rq() is only called after a request structure has been recycled
so I don't think that there can be any kind of race between the code paths
that use atomic instructions to manipulate the __SCMD_COMPLETE bit and
scsi_queue_rq().

Bart.

Reply via email to