Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get

2015-04-29 Thread Christoph Hellwig
On Wed, Apr 29, 2015 at 10:17:59AM +0900, Akinobu Mita wrote:
 This change broke ufs driver.

I'd claim the ufs driver, or rather more specifily the ufs spec
had already been broken.  That whole concept of keeping references
to scsi devices to send commands to for host state changes is just
fundamentally broken.  

 The reason for scsi_device_get() in ufshcd_set_dev_pwr_mode() is
 to avoid manual delete of UFS device W-LUN by holding the reference
 to it.  So can we acquire shost-scan_mutex lock instead of
 scsi_device_get()?  I tried attached patch and it seems to be working,
 but I would like to ask your opinion about this change.

It seems like a reasonable workaround.  But then again the concept
of the driver hanging on scsi_devices is and will stay broken.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get

2015-04-28 Thread Akinobu Mita
2015-02-02 22:01 GMT+09:00 Christoph Hellwig h...@lst.de:
 This effectively reverts commits 85b6c7 ([SCSI] sd: fix cache flushing on
 module removal (and individual device removal) and dc4515ea (scsi: always
 increment reference count).

 We now never call scsi_device_get from the shutdown path, and the fact
 that we started grabbing reference there in commit 85b6c7 turned out
 turned out to create more problems than it solves, and required
 workarounds for workarounds for workarounds. Move back to properly checking
 the device state and carefully handle module refcounting.

 Signed-off-by: Christoph Hellwig h...@lst.de
 ---
  drivers/scsi/scsi.c | 20 +---
  1 file changed, 13 insertions(+), 7 deletions(-)

 diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
 index 9b38299..9b7fd0b 100644
 --- a/drivers/scsi/scsi.c
 +++ b/drivers/scsi/scsi.c
 @@ -979,18 +979,24 @@ EXPORT_SYMBOL(scsi_report_opcode);
   * Description: Gets a reference to the scsi_device and increments the use 
 count
   * of the underlying LLDD module.  You must hold host_lock of the
   * parent Scsi_Host or already have a reference when calling this.
 + *
 + * This will fail if a device is deleted or cancelled, or when the LLD module
 + * is in the process of being unloaded.
   */
  int scsi_device_get(struct scsi_device *sdev)

Hi Christoph,

This change broke ufs driver.

Because scsi_device_get() can be called while the module is being
unloaded with the device runtime suspended.
(i.e. driver_detach - ... pm_runtime_get_sync() ... -
ufshcd_runtime_resume - ufshcd_resume - ufshcd_set_dev_pwr_mode -
scsi_device_get - try_module_get - return -ENXIO)

The reason for scsi_device_get() in ufshcd_set_dev_pwr_mode() is
to avoid manual delete of UFS device W-LUN by holding the reference
to it.  So can we acquire shost-scan_mutex lock instead of
scsi_device_get()?  I tried attached patch and it seems to be working,
but I would like to ask your opinion about this change.

  {
 -   if (sdev-sdev_state == SDEV_DEL)
 -   return -ENXIO;
 +   if (sdev-sdev_state == SDEV_DEL || sdev-sdev_state == SDEV_CANCEL)
 +   goto fail;
 if (!get_device(sdev-sdev_gendev))
 -   return -ENXIO;
 -   /* We can fail try_module_get if we're doing SCSI operations
 -* from module exit (like cache flush) */
 -   __module_get(sdev-host-hostt-module);
 -
 +   goto fail;
 +   if (!try_module_get(sdev-host-hostt-module))
 +   goto fail_put_device;
 return 0;
 +
 +fail_put_device:
 +   put_device(sdev-sdev_gendev);
 +fail:
 +   return -ENXIO;
  }
  EXPORT_SYMBOL(scsi_device_get);

 --
 1.9.1

 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
From a305a5284cac23adbf7f86b3014cc2e6325c7b88 Mon Sep 17 00:00:00 2001
From: Akinobu Mita akinobu.m...@gmail.com
Date: Wed, 29 Apr 2015 10:02:17 +0900
Subject: [PATCH] scsi: ufs: fix ufshcd_set_dev_pwr_mode() when unloading
 module

---
 drivers/scsi/ufs/ufshcd.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 540e00d..91cbc04 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4695,23 +4695,19 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	struct scsi_sense_hdr sshdr;
 	struct scsi_device *sdp;
 	unsigned long flags;
-	int ret;
+	int ret = 0;
 
+	mutex_lock(hba-host-scan_mutex);
 	spin_lock_irqsave(hba-host-host_lock, flags);
 	sdp = hba-sdev_ufs_device;
-	if (sdp) {
-		ret = scsi_device_get(sdp);
-		if (!ret  !scsi_device_online(sdp)) {
-			ret = -ENODEV;
-			scsi_device_put(sdp);
-		}
-	} else {
+	if (!sdp || !scsi_device_online(sdp))
 		ret = -ENODEV;
-	}
 	spin_unlock_irqrestore(hba-host-host_lock, flags);
 
-	if (ret)
+	if (ret) {
+		mutex_unlock(hba-host-scan_mutex);
 		return ret;
+	}
 
 	/*
 	 * If scsi commands fail, the scsi mid-layer schedules scsi error-
@@ -4748,7 +4744,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	if (!ret)
 		hba-curr_dev_pwr_mode = pwr_mode;
 out:
-	scsi_device_put(sdp);
+	mutex_unlock(hba-host-scan_mutex);
 	hba-host-eh_noresume = 0;
 	return ret;
 }
-- 
1.9.1



Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get

2015-03-05 Thread Paolo Bonzini


On 02/02/2015 14:01, Christoph Hellwig wrote:
 This effectively reverts commits 85b6c7 ([SCSI] sd: fix cache flushing on
 module removal (and individual device removal) and dc4515ea (scsi: always
 increment reference count).
 
 We now never call scsi_device_get from the shutdown path, and the fact
 that we started grabbing reference there in commit 85b6c7 turned out
 turned out to create more problems than it solves, and required
 workarounds for workarounds for workarounds. Move back to properly checking
 the device state and carefully handle module refcounting.
 
 Signed-off-by: Christoph Hellwig h...@lst.de
 ---
  drivers/scsi/scsi.c | 20 +---
  1 file changed, 13 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
 index 9b38299..9b7fd0b 100644
 --- a/drivers/scsi/scsi.c
 +++ b/drivers/scsi/scsi.c
 @@ -979,18 +979,24 @@ EXPORT_SYMBOL(scsi_report_opcode);
   * Description: Gets a reference to the scsi_device and increments the use 
 count
   * of the underlying LLDD module.  You must hold host_lock of the
   * parent Scsi_Host or already have a reference when calling this.
 + *
 + * This will fail if a device is deleted or cancelled, or when the LLD module
 + * is in the process of being unloaded.
   */
  int scsi_device_get(struct scsi_device *sdev)
  {
 - if (sdev-sdev_state == SDEV_DEL)
 - return -ENXIO;
 + if (sdev-sdev_state == SDEV_DEL || sdev-sdev_state == SDEV_CANCEL)
 + goto fail;
   if (!get_device(sdev-sdev_gendev))
 - return -ENXIO;
 - /* We can fail try_module_get if we're doing SCSI operations
 -  * from module exit (like cache flush) */
 - __module_get(sdev-host-hostt-module);
 -
 + goto fail;
 + if (!try_module_get(sdev-host-hostt-module))
 + goto fail_put_device;
   return 0;
 +
 +fail_put_device:
 + put_device(sdev-sdev_gendev);
 +fail:
 + return -ENXIO;
  }
  EXPORT_SYMBOL(scsi_device_get);
  
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get

2015-01-29 Thread James Bottomley
On Thu, 2015-01-29 at 00:00 +0100, Christoph Hellwig wrote:
 This effectively reverts commits 85b6c7 ([SCSI] sd: fix cache flushing on
 module removal (and individual device removal) and dc4515ea (scsi: always
 increment reference count).
 
 We now never call scsi_device_get from the shutdown path, and the fact
 that we started grabbing reference there in commit 85b6c7 turned out
 turned out to create more problems than it solves, and required
 workarounds for workarounds for workarounds. Move back to properly checking
 the device state and carefully handle module refcounting.
 
 Signed-off-by: Christoph Hellwig h...@lst.de
 ---
  drivers/scsi/scsi.c | 17 ++---
  1 file changed, 10 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
 index 9b38299..95f0293 100644
 --- a/drivers/scsi/scsi.c
 +++ b/drivers/scsi/scsi.c
 @@ -982,15 +982,18 @@ EXPORT_SYMBOL(scsi_report_opcode);
   */
  int scsi_device_get(struct scsi_device *sdev)
  {
 - if (sdev-sdev_state == SDEV_DEL)
 - return -ENXIO;
 + if (sdev-sdev_state == SDEV_DEL || sdev-sdev_state == SDEV_CANCEL)
 + goto fail;
   if (!get_device(sdev-sdev_gendev))
 - return -ENXIO;
 - /* We can fail try_module_get if we're doing SCSI operations
 -  * from module exit (like cache flush) */
 - __module_get(sdev-host-hostt-module);
 -
 + goto fail;
 + if (!try_module_get(sdev-host-hostt-module))
 + goto fail_put_device;
   return 0;
 +
 +fail_put_device:
 + put_device(sdev-sdev_gendev);
 +fail:
 + return -ENXIO;
  }
  EXPORT_SYMBOL(scsi_device_get);

If we can get away with this, I'm all for this approach.  However, you
need to document in a comment or above the function that it may not be
called in module exit functions and why.

Other than the comment issue, the series looks good,

Thanks,

James