On Mon, Apr 10, 2017 at 10:54:01AM -0700, Bart Van Assche wrote:
> This patch avoids that sd_shutdown() hangs on the SYNCHRONIZE CACHE
> command if the block layer queue has been stopped by
> scsi_target_block().
> 
> Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
> Cc: Israel Rukshin <isra...@mellanox.com>
> Cc: Max Gurtovoy <m...@mellanox.com>
> Cc: Hannes Reinecke <h...@suse.de>
> ---
>  drivers/scsi/sd.c | 25 +++++++++++++++++++++++--
>  1 file changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index fe0f7997074e..8e98b7684893 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1489,6 +1489,22 @@ static unsigned int sd_check_events(struct gendisk 
> *disk, unsigned int clearing)
>       return retval;
>  }
> 
> +/*
> + * Issue a SYNCHRONIZE CACHE command asynchronously. Since 
> blk_cleanup_queue()
> + * waits for all commands to finish, __scsi_remove_device() will wait for the
> + * SYNCHRONIZE CACHE command to finish.
> + */
> +static int sd_sync_cache_async(struct scsi_disk *sdkp)
> +{
> +     const struct scsi_device *sdp = sdkp->device;
> +     const int timeout = sdp->request_queue->rq_timeout *
> +                         SD_FLUSH_TIMEOUT_MULTIPLIER;
> +     const unsigned char cmd[10] = { SYNCHRONIZE_CACHE };
> +
> +     return scsi_execute_async(sdp, cmd, DMA_NONE, NULL, 0, timeout,
> +                               SD_MAX_RETRIES, 0, 0);
> +}
> +

OK, so I take it the problem is when the queue is stopped, then the
completion in blk_execute_rq() will never be triggered and then we wait
for a timeout there, or potentially forever?

But then what is the point in trying to do it async here anyway? Won't
that just be doomed in the same way, just that we don't see the effect?

>  static int sd_sync_cache(struct scsi_disk *sdkp)
>  {
>       int retries, res;
> @@ -3356,6 +3372,8 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, 
> int start)
>  static void sd_shutdown(struct device *dev)
>  {
>       struct scsi_disk *sdkp = dev_get_drvdata(dev);
> +     const bool stop_disk = system_state != SYSTEM_RESTART &&
> +             sdkp->device->manage_start_stop;
> 
>       if (!sdkp)
>               return;         /* this can happen */

That seems wrong then. You already dereference sdkp before the function
checks whether or not the pointer is valid (at least if you can trust
the comment there).

> @@ -3365,10 +3383,13 @@ static void sd_shutdown(struct device *dev)
> 
>       if (sdkp->WCE && sdkp->media_present) {
>               sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
> -             sd_sync_cache(sdkp);
> +             if (stop_disk)
> +                     sd_sync_cache(sdkp);
> +             else
> +                     sd_sync_cache_async(sdkp);

That makes the function-documentation obsolete, doesn't it?


                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block

>       }
> 
> -     if (system_state != SYSTEM_RESTART && sdkp->device->manage_start_stop) {
> +     if (stop_disk) {
>               sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n");
>               sd_start_stop_device(sdkp, 0);
>       }
> -- 
> 2.12.0
> 
-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
                  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

Reply via email to