Hello Hannes,

On Wed, Mar 01, 2017 at 10:15:18AM +0100, Hannes Reinecke wrote:
> When a command is sent as part of the error handling there
> is no point whatsoever to start EH escalation when that
> command fails; we are _already_ in the error handler,
> and the escalation is about to commence anyway.
> So just call 'scsi_try_to_abort_cmd()' to abort outstanding
> commands and let the main EH routine handle the rest.
>
> Signed-off-by: Hannes Reinecke <h...@suse.de>
> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
> Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
> ---
>  drivers/scsi/scsi_error.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index e1ca3b8..4613aa1 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -889,15 +889,6 @@ static int scsi_try_to_abort_cmd(struct scsi_host_template *hostt,
>       return hostt->eh_abort_handler(scmd);
>  }
>
> -static void scsi_abort_eh_cmnd(struct scsi_cmnd *scmd)
> -{
> -     if (scsi_try_to_abort_cmd(scmd->device->host->hostt, scmd) != SUCCESS)
> -             if (scsi_try_bus_device_reset(scmd) != SUCCESS)
> -                     if (scsi_try_target_reset(scmd) != SUCCESS)
> -                             if (scsi_try_bus_reset(scmd) != SUCCESS)
> -                                     scsi_try_host_reset(scmd);
> -}
> -
>  /**
>   * scsi_eh_prep_cmnd  - Save a scsi command info as part of error recovery
>   * @scmd:       SCSI command structure to hijack
> @@ -1082,7 +1073,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
>                       break;
>               }
>       } else if (rtn != FAILED) {
> -             scsi_abort_eh_cmnd(scmd);
> +             scsi_try_to_abort_cmd(shost->hostt, scmd);
>               rtn = FAILED;
>       }

The idea is sound, but this implementation can lead to use-after-free bugs.

I only know our own LLD well enough to judge, but with zFCP there will
always be a chance that an abort fails - be it memory pressure,
hardware/firmware behavior or internal EH in zFCP.

For us in the LLD, a call to queuecommand() means that we allocate a
unique internal request struct for the scsi_cmnd (struct
zfcp_fsf_request) and add it to our internal hash table of
outstanding commands. We assume this scsi_cmnd pointer is ours until we
complete it via scsi_done or yield it via successful EH actions.
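
To make that ownership rule concrete, here is a minimal userspace sketch.
It is not zFCP or kernel code: struct cmd, the fixed-size table and the
lld_*() helpers are hypothetical stand-ins for scsi_cmnd, our hash table
and the LLD entry points.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scsi_cmnd. */
struct cmd {
	int tag;
};

#define MAX_OUTSTANDING 8

/* Commands the LLD currently considers its own (stand-in for a hash table). */
static struct cmd *outstanding[MAX_OUTSTANDING];

/* Models queuecommand(): the LLD takes ownership of the pointer. */
static bool lld_queuecommand(struct cmd *c)
{
	assert(c != NULL);
	for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
		if (outstanding[i] == NULL) {
			outstanding[i] = c;
			return true;
		}
	}
	return false;	/* table full, ownership was not taken */
}

/* Models giving the command back: completion or a successful EH action. */
static bool lld_release(struct cmd *c)
{
	for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
		if (outstanding[i] == c) {
			outstanding[i] = NULL;
			return true;
		}
	}
	return false;	/* the LLD did not own this command */
}

/* True while the LLD may still dereference the pointer. */
static bool lld_owns(const struct cmd *c)
{
	for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
		if (outstanding[i] == c)
			return true;
	}
	return false;
}
```

In this model the pointer is only safe to reuse or free by the caller
once lld_owns() returns false for it.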

If the abort fails, EH does not take back ownership of the
SCSI command. That in turn means possible use-after-frees: we
still think the SCSI command is ours, but EH has already overwritten
it with the original one. If we then still get a response, or
otherwise use the scsi_cmnd pointer, we would access an invalid one.
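
The difference between the two EH variants can be sketched in a small
userspace model. Everything here is an assumption made for illustration:
the single-slot LLD, the abort_will_fail knob and the helper names are
hypothetical, and lld_reset() always succeeds in this model, which a real
reset is not guaranteed to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum eh_result { EH_SUCCESS, EH_FAILED };

/* Hypothetical stand-in for struct scsi_cmnd. */
struct cmd {
	int tag;
};

/* Single-slot model of the LLD's outstanding request. */
static struct cmd *lld_owned;
/* Knob to simulate an abort that fails (memory pressure, firmware, ...). */
static bool abort_will_fail;

/* Models queuecommand(): the LLD takes ownership of the pointer. */
static void lld_queuecommand(struct cmd *c)
{
	assert(lld_owned == NULL);
	lld_owned = c;
}

/* Models eh_abort_handler: ownership is returned only on success. */
static enum eh_result lld_abort(struct cmd *c)
{
	(void)c;
	if (abort_will_fail)
		return EH_FAILED;
	lld_owned = NULL;
	return EH_SUCCESS;
}

/* Models a stronger reset step; assumed to always reclaim the command. */
static enum eh_result lld_reset(void)
{
	lld_owned = NULL;
	return EH_SUCCESS;
}

/* Abort only, result ignored. Returns true if c is safe to reuse. */
static bool eh_abort_only(struct cmd *c)
{
	(void)lld_abort(c);
	return lld_owned != c;
}

/* Abort, then escalate on failure. Returns true if c is safe to reuse. */
static bool eh_abort_or_escalate(struct cmd *c)
{
	if (lld_abort(c) != EH_SUCCESS)
		(void)lld_reset();
	return lld_owned != c;
}
```

With abort_will_fail set, eh_abort_only() leaves the command registered
with the LLD while the caller carries on as if it had it back, which is
the window for the use-after-free. eh_abort_or_escalate() closes that
window only because the reset in this model cannot fail.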

I guess this may well be true for other LLDs, too.


                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block

>
> --
> 1.8.5.6
>

--
Linux on z Systems Development         /         IBM Systems & Technology Group
                  IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
