On Fri, 2017-05-19 at 14:53 -0700, Himanshu Madhani wrote:
> From: Quinn Tran <quinn.t...@cavium.com>
> 
> In case of hardware queue full, commands can loop between
> TCM stack and tcm_qla2xx shim layers for retry. While command
> is waiting for retry, task mgmt can get ahead and abort the
> cmmand that encountered queue full condition. Fix this by
> dropping the command, if task mgmt has already started the
> command free process.
> 
> Cc: <sta...@vger.kernel.org>
> Signed-off-by: Quinn Tran <quinn.t...@cavium.com>
> Signed-off-by: Himanshu Madhani <himanshu.madh...@cavium.com>
> ---
>  drivers/scsi/qla2xxx/tcm_qla2xxx.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
> b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> index 7443e4efa3ae..07f8ad001bcb 100644
> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> @@ -686,6 +686,20 @@ static int tcm_qla2xxx_queue_status(struct se_cmd 
> *se_cmd)
>                               struct qla_tgt_cmd, se_cmd);
>       int xmit_type = QLA_TGT_XMIT_STATUS;
>  
> +     if (cmd->aborted) {
> +             /*
> +              * Cmd can loop during Q-full. tcm_qla2xxx_aborted_task
> +              * can get ahead of this cmd. tcm_qla2xxx_aborted_task
> +              * already kick start the free.
> +              */
> +             pr_debug(
> +                 "queue_data_in aborted cmd[%p] refcount %d transport_state 
> %x, t_state %x, se_cmd_flags %x\n",
> +                 cmd, kref_read(&cmd->se_cmd.cmd_kref),
> +                 cmd->se_cmd.transport_state, cmd->se_cmd.t_state,
> +                 cmd->se_cmd.se_cmd_flags);
> +             return 0;
> +     }
> +
>       cmd->bufflen = se_cmd->data_length;
>       cmd->sg = NULL;
>       cmd->sg_cnt = 0;

As your comment above mentions, there is a known issue in target-core
when queue-full occurs repeatably and a se_cmd descriptor is explicitly
stopped via session shutdown, TMR ABORT_TASK / LUN_RESET or otherwise.

We hit this scenario a while back in iser-target with iw_cxgb hw, which
due to a separate bug was constantly triggering queue-full under load to
uncover this same case.

So I'm still considering the different approaches to address this in
target-core proper, but don't have a problem with merging this as-is as
it won't logically conflict with any of those changes.

That said:

Acked-by: Nicholas Bellinger <n...@linux-iscsi.org>

Reply via email to