On Sat, 28 Sep 2019, Damien Le Moal wrote:
> When a non-passthrough command is terminated with CHECK CONDITION,
> request sense is executed by hijacking the command descriptor. Since
> scsi_eh_prep_cmnd() and scsi_eh_restore_cmnd() do not save/restore the
> original command resid, the value returned on failure of the original
> command is lost and replaced with the value set by the execution of the
> request sense command. This value may in many instances be unaligned to
> the device sector size, causing sd_done() to print a warning message
> about the incorrect unaligned resid before the command is retried or
> aborted.
>
> Fix this problem by saving the original command resid in struct
> scsi_eh_save using scsi_eh_prep_cmnd() and restoring it in
> scsi_eh_restore_cmnd(). In addition, to make sure that the request sense
> command is executed with a correctly initialized command structure, also
> reset resid to 0 in scsi_eh_prep_cmnd() after saving the original
> command resid value in struct scsi_eh_save.
>
> Cc: [email protected]
> Signed-off-by: Damien Le Moal <[email protected]>
> ---
>
> Changes from V1:
> * Dropped patch 2
> * Add resid reset in scsi_eh_prep_cmnd()
>
> drivers/scsi/scsi_error.c | 3 +++
> include/scsi/scsi_eh.h | 1 +
> 2 files changed, 4 insertions(+)
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 1c470e31ae81..f53828bf7ad7 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -967,6 +967,7 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct
> scsi_eh_save *ses,
> ses->data_direction = scmd->sc_data_direction;
> ses->sdb = scmd->sdb;
> ses->result = scmd->result;
> + ses->resid = scsi_get_resid(scmd);
> ses->underflow = scmd->underflow;
> ses->prot_op = scmd->prot_op;
> ses->eh_eflags = scmd->eh_eflags;
> @@ -977,6 +978,7 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct
> scsi_eh_save *ses,
> memset(scmd->cmnd, 0, BLK_MAX_CDB);
> memset(&scmd->sdb, 0, sizeof(scmd->sdb));
> scmd->result = 0;
> + scsi_set_resid(scmd, 0);
>
> if (sense_bytes) {
> scmd->sdb.length = min_t(unsigned, SCSI_SENSE_BUFFERSIZE,
> @@ -1029,6 +1031,7 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd,
> struct scsi_eh_save *ses)
> scmd->sc_data_direction = ses->data_direction;
> scmd->sdb = ses->sdb;
> scmd->result = ses->result;
> + scsi_set_resid(scmd, ses->resid);
When saving and restoring state, perhaps it makes more sense to bypass the
higher level getter/setter API? Open-coded assignment statements are
already prevalent here, rather than calls to e.g. scsi_set_prot_op(),
set_msg_byte() etc. (There may be no code elsewhere that could tell the
difference, but we can't use "private" members to prove it, unlike C++.)
> scmd->underflow = ses->underflow;
> scmd->prot_op = ses->prot_op;
> scmd->eh_eflags = ses->eh_eflags;
> diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
> index 3810b340551c..9caa9b262a32 100644
> --- a/include/scsi/scsi_eh.h
> +++ b/include/scsi/scsi_eh.h
> @@ -32,6 +32,7 @@ extern int scsi_ioctl_reset(struct scsi_device *, int
> __user *);
> struct scsi_eh_save {
> /* saved state */
> int result;
> + unsigned int resid;
There seems to be an inconsistency here. A signed int would be consistent
with the getter and setter helpers. Whereas, if you open-coded the
assignments instead, your unsigned int would make sense because
scsi_request.resid_len really is an unsigned int.
--
> int eh_eflags;
> enum dma_data_direction data_direction;
> unsigned underflow;
>