Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get
On Wed, Apr 29, 2015 at 10:17:59AM +0900, Akinobu Mita wrote: This change broke ufs driver. I'd claim the ufs driver, or rather more specifily the ufs spec had already been broken. That whole concept of keeping references to scsi devices to send commands to for host state changes is just fundamentally broken. The reason for scsi_device_get() in ufshcd_set_dev_pwr_mode() is to avoid manual delete of UFS device W-LUN by holding the reference to it. So can we acquire shost-scan_mutex lock instead of scsi_device_get()? I tried attached patch and it seems to be working, but I would like to ask your opinion about this change. It seems like a reasonable workaround. But then again the concept of the driver hanging on scsi_devices is and will stay broken. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get
2015-02-02 22:01 GMT+09:00 Christoph Hellwig h...@lst.de: This effectively reverts commits 85b6c7 ([SCSI] sd: fix cache flushing on module removal (and individual device removal) and dc4515ea (scsi: always increment reference count). We now never call scsi_device_get from the shutdown path, and the fact that we started grabbing reference there in commit 85b6c7 turned out turned out to create more problems than it solves, and required workarounds for workarounds for workarounds. Move back to properly checking the device state and carefully handle module refcounting. Signed-off-by: Christoph Hellwig h...@lst.de --- drivers/scsi/scsi.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 9b38299..9b7fd0b 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -979,18 +979,24 @@ EXPORT_SYMBOL(scsi_report_opcode); * Description: Gets a reference to the scsi_device and increments the use count * of the underlying LLDD module. You must hold host_lock of the * parent Scsi_Host or already have a reference when calling this. + * + * This will fail if a device is deleted or cancelled, or when the LLD module + * is in the process of being unloaded. */ int scsi_device_get(struct scsi_device *sdev) Hi Christoph, This change broke ufs driver. Because scsi_device_get() can be called while the module is being unloaded with the device runtime suspended. (i.e. driver_detach - ... pm_runtime_get_sync() ... - ufshcd_runtime_resume - ufshcd_resume - ufshcd_set_dev_pwr_mode - scsi_device_get - try_module_get - return -ENXIO) The reason for scsi_device_get() in ufshcd_set_dev_pwr_mode() is to avoid manual delete of UFS device W-LUN by holding the reference to it. So can we acquire shost-scan_mutex lock instead of scsi_device_get()? I tried attached patch and it seems to be working, but I would like to ask your opinion about this change. { - if (sdev-sdev_state == SDEV_DEL) - return -ENXIO; + if (sdev-sdev_state == SDEV_DEL || sdev-sdev_state == SDEV_CANCEL) + goto fail; if (!get_device(sdev-sdev_gendev)) - return -ENXIO; - /* We can fail try_module_get if we're doing SCSI operations -* from module exit (like cache flush) */ - __module_get(sdev-host-hostt-module); - + goto fail; + if (!try_module_get(sdev-host-hostt-module)) + goto fail_put_device; return 0; + +fail_put_device: + put_device(sdev-sdev_gendev); +fail: + return -ENXIO; } EXPORT_SYMBOL(scsi_device_get); -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html From a305a5284cac23adbf7f86b3014cc2e6325c7b88 Mon Sep 17 00:00:00 2001 From: Akinobu Mita akinobu.m...@gmail.com Date: Wed, 29 Apr 2015 10:02:17 +0900 Subject: [PATCH] scsi: ufs: fix ufshcd_set_dev_pwr_mode() when unloading module --- drivers/scsi/ufs/ufshcd.c | 18 +++--- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 540e00d..91cbc04 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4695,23 +4695,19 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba, struct scsi_sense_hdr sshdr; struct scsi_device *sdp; unsigned long flags; - int ret; + int ret = 0; + mutex_lock(hba-host-scan_mutex); spin_lock_irqsave(hba-host-host_lock, flags); sdp = hba-sdev_ufs_device; - if (sdp) { - ret = scsi_device_get(sdp); - if (!ret !scsi_device_online(sdp)) { - ret = -ENODEV; - scsi_device_put(sdp); - } - } else { + if (!sdp || !scsi_device_online(sdp)) ret = -ENODEV; - } spin_unlock_irqrestore(hba-host-host_lock, flags); - if (ret) + if (ret) { + mutex_unlock(hba-host-scan_mutex); return ret; + } /* * If scsi commands fail, the scsi mid-layer schedules scsi error- @@ -4748,7 +4744,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba, if (!ret) hba-curr_dev_pwr_mode = pwr_mode; out: - scsi_device_put(sdp); + mutex_unlock(hba-host-scan_mutex); hba-host-eh_noresume = 0; return ret; } -- 1.9.1
Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get
On 02/02/2015 14:01, Christoph Hellwig wrote: This effectively reverts commits 85b6c7 ([SCSI] sd: fix cache flushing on module removal (and individual device removal) and dc4515ea (scsi: always increment reference count). We now never call scsi_device_get from the shutdown path, and the fact that we started grabbing reference there in commit 85b6c7 turned out turned out to create more problems than it solves, and required workarounds for workarounds for workarounds. Move back to properly checking the device state and carefully handle module refcounting. Signed-off-by: Christoph Hellwig h...@lst.de --- drivers/scsi/scsi.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 9b38299..9b7fd0b 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -979,18 +979,24 @@ EXPORT_SYMBOL(scsi_report_opcode); * Description: Gets a reference to the scsi_device and increments the use count * of the underlying LLDD module. You must hold host_lock of the * parent Scsi_Host or already have a reference when calling this. + * + * This will fail if a device is deleted or cancelled, or when the LLD module + * is in the process of being unloaded. */ int scsi_device_get(struct scsi_device *sdev) { - if (sdev-sdev_state == SDEV_DEL) - return -ENXIO; + if (sdev-sdev_state == SDEV_DEL || sdev-sdev_state == SDEV_CANCEL) + goto fail; if (!get_device(sdev-sdev_gendev)) - return -ENXIO; - /* We can fail try_module_get if we're doing SCSI operations - * from module exit (like cache flush) */ - __module_get(sdev-host-hostt-module); - + goto fail; + if (!try_module_get(sdev-host-hostt-module)) + goto fail_put_device; return 0; + +fail_put_device: + put_device(sdev-sdev_gendev); +fail: + return -ENXIO; } EXPORT_SYMBOL(scsi_device_get); Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get
On Thu, 2015-01-29 at 00:00 +0100, Christoph Hellwig wrote: This effectively reverts commits 85b6c7 ([SCSI] sd: fix cache flushing on module removal (and individual device removal) and dc4515ea (scsi: always increment reference count). We now never call scsi_device_get from the shutdown path, and the fact that we started grabbing reference there in commit 85b6c7 turned out turned out to create more problems than it solves, and required workarounds for workarounds for workarounds. Move back to properly checking the device state and carefully handle module refcounting. Signed-off-by: Christoph Hellwig h...@lst.de --- drivers/scsi/scsi.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 9b38299..95f0293 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -982,15 +982,18 @@ EXPORT_SYMBOL(scsi_report_opcode); */ int scsi_device_get(struct scsi_device *sdev) { - if (sdev-sdev_state == SDEV_DEL) - return -ENXIO; + if (sdev-sdev_state == SDEV_DEL || sdev-sdev_state == SDEV_CANCEL) + goto fail; if (!get_device(sdev-sdev_gendev)) - return -ENXIO; - /* We can fail try_module_get if we're doing SCSI operations - * from module exit (like cache flush) */ - __module_get(sdev-host-hostt-module); - + goto fail; + if (!try_module_get(sdev-host-hostt-module)) + goto fail_put_device; return 0; + +fail_put_device: + put_device(sdev-sdev_gendev); +fail: + return -ENXIO; } EXPORT_SYMBOL(scsi_device_get); If we can get away with this, I'm all for this approach. However, you need to document in a comment or above the function that it may not be called in module exit functions and why. Other than the comment issue, the series looks good, Thanks, James