Re: [PATCH 5/6] sd: don't start disks on system resume
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/17/2013 1:43 AM, James Bottomley wrote: -static int sd_resume(struct device *dev) +static int sd_resume_runtime(struct device *dev) { struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); int ret = 0; -if (!sdkp-device-manage_start_stop) - goto done; - This would make every disk start on resume ... even those which were spun down before suspend. What? Those are lines I removed. sd_printk(KERN_NOTICE, sdkp, Starting disk\n); + dump_stack(); I assume this is a debug left over that should be removed? Yes. ret = sd_start_stop_device(sdkp, 1); -done: scsi_disk_put(sdkp); return ret; } +#ifdef CONFIG_PM_RUNTIME I really don't like the ifdefs. The behaviour changes look pretty radical for something that's supposed to be an add on to system suspend. How is it radical? Either we wake the disk as we did before, or we use runtime pm to (try to) keep it suspended until accessed. The #ifdefs are required because it depends on runtime pm. +static void sd_resume_work(struct work_struct *work) +{ + struct scsi_disk *sdkp = container_of(work, struct scsi_disk, resume_work.work); + struct scsi_device *sdp = sdkp-device; + struct device *dev = sdkp-dev; + int res; + unsigned char cmd[10] = { 0 }; + struct scsi_sense_hdr sshdr; + const int timeout = sdp-request_queue-rq_timeout; + +cmd[0] = REQUEST_SENSE; You can't just send a random request sense because the reply will also be pretty random (either no sense or the sense of the last check condition depending on how the disk is feeling). To see if a disk is spun down, you need to send a test unit ready (TUR). - From what I can see in the standards, this is not true. TEST UNIT READY will give an error response if the drive requires START STOP UNIT, but if it does not, either because it is actually an ata disk, or because it auto starts ( the standard indicates scsi disks can be configured to do this, but does not say how ), and is simply in standby, then it will return success. In other words, ata drives always return ready and scsi disks do as well, if they are in standby mode rather than stopped mode. On the other hand, there are several references in the standards to using REQUEST SENSE to poll the drive status rather than to read sense data for the last command, and I believe there was a passage somewhere that mentioned the last command status is cleared once it has been read once, presumably so that a subsequent REQUEST SENSE will instead get the current status of the drive. SAT-3 specifies that REQUEST SENSE is to be translated to an ata CHECK POWER CONDITION specifically so that it can be used to poll the drive to see if it is currently in standby. You can use sdparm --command=sense to issue a REQUEST SENSE command to a drive and see for yourself. If you configure the drive to auto suspend using the timer, then poll it, you should see the response indicating it is in standby. See SPC-4 section 5.6 and SAT-3 section 8.10. When you send the TUR, you only test the sense if check condition was returned. Plus, I think you only want to start the disk if it sends back an ASC/ASCQ saying initialising command is required, don't you? REQUEST SENSE gives GOOD rather than CHECK CONDITION. You want to NOT start the disk if you get back that status. The whole point is to let the disk stay asleep. The idea is to force the disk into runtime suspend so it can sleep, and be started when accessed, but if it is already up and running ( as most ata disks will be ), then don't pretend it is suspended since it really isn't. OK, I think I'm lost here ... why are we finalising suspend in the resume thread? Because we want to let runtime pm start the disk at a later time, once it is accessed, rather than when the system resumes. +execute_in_process_context(sd_resume_work, + sdkp-resume_work); Why is this necessary ... I thought resume was run on a thread (so it can sleep)? Because we don't want to block the resume path for many seconds and delay user space unfreezing. I actually need to fix that to unconditionally queues the work item since right now it is in process context, so it just directly calls the other function. +err: + scsi_disk_put(sdkp); + return ret; +#else +if (sdkp-device-manage_start_stop) + return sd_resume_runtime(dev); +return 0; +#endif This doesn't really look right: if we have runtime resume enabled, the system resume will always spin up; if we don't, it will conditionally spin up? Other way around; if we *don't* have runtime pm configured, then system resume always spins up, and if we do, then we try to delay spinning up until the disk is accessed, but check to see if the drive has spun up on its own. -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.17 (MingW32) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
Re: [PATCH 5/6] sd: don't start disks on system resume
On Tue, 2013-12-17 at 10:01 -0500, Phillip Susi wrote: On 12/17/2013 1:43 AM, James Bottomley wrote: -static int sd_resume(struct device *dev) +static int sd_resume_runtime(struct device *dev) { struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); int ret = 0; - if (!sdkp-device-manage_start_stop) - goto done; - This would make every disk start on resume ... even those which were spun down before suspend. What? Those are lines I removed. Yes, they say that in sd_resume, if we're managing the disk start stop, spin it up. Removing them spins everything up unconditionally. sd_printk(KERN_NOTICE, sdkp, Starting disk\n); + dump_stack(); I assume this is a debug left over that should be removed? Yes. ret = sd_start_stop_device(sdkp, 1); -done: scsi_disk_put(sdkp); return ret; } +#ifdef CONFIG_PM_RUNTIME I really don't like the ifdefs. The behaviour changes look pretty radical for something that's supposed to be an add on to system suspend. How is it radical? Either we wake the disk as we did before, or we use runtime pm to (try to) keep it suspended until accessed. The #ifdefs are required because it depends on runtime pm. Good systems engineering practise does not have different behaviours for code with and without ifdefs. You want to be adding incremental behaviour not modifying existing. The reason is testing; now you have to build two different kernels to test the two behaviours and mostly one path gets selected by every distro and the other bitrots. +static void sd_resume_work(struct work_struct *work) +{ + struct scsi_disk *sdkp = container_of(work, struct scsi_disk, resume_work.work); + struct scsi_device *sdp = sdkp-device; + struct device *dev = sdkp-dev; + int res; + unsigned char cmd[10] = { 0 }; + struct scsi_sense_hdr sshdr; + const int timeout = sdp-request_queue-rq_timeout; + + cmd[0] = REQUEST_SENSE; You can't just send a random request sense because the reply will also be pretty random (either no sense or the sense of the last check condition depending on how the disk is feeling). To see if a disk is spun down, you need to send a test unit ready (TUR). From what I can see in the standards, this is not true. TEST UNIT READY will give an error response if the drive requires START STOP UNIT, but if it does not, either because it is actually an ata disk, or because it auto starts ( the standard indicates scsi disks can be configured to do this, but does not say how ), and is simply in standby, then it will return success. Are you confused about the power states? in SCSI, stand by and idle are different from spun down (or stopped). Right at the moment, start stop manages spin down/spin up not the granular power states for almost every device. The only subsystem that actually uses power state management is firewire. Even if you enable power state management for ATA, we have to cope with non-ATA devices. The standards actually say that SCSI devices automatically come out of standby power conditions unless the transport protocol says something different. In other words, ata drives always return ready and scsi disks do as well, if they are in standby mode rather than stopped mode. Yes, but they aren't going to be in standby mode apart from firewire devices; they're going to be stopped. On the other hand, there are several references in the standards to using REQUEST SENSE to poll the drive status rather than to read sense data for the last command, and I believe there was a passage somewhere that mentioned the last command status is cleared once it has been read once, presumably so that a subsequent REQUEST SENSE will instead get the current status of the drive. SAT-3 specifies that REQUEST SENSE is to be translated to an ata CHECK POWER CONDITION specifically so that it can be used to poll the drive to see if it is currently in standby. You can use sdparm --command=sense to issue a REQUEST SENSE command to a drive and see for yourself. If you configure the drive to auto suspend using the timer, then poll it, you should see the response indicating it is in standby. See SPC-4 section 5.6 That's self test operations. I assume you mean 6.29 REQUEST SENSE? and SAT-3 section 8.10. What SAT says is irrelevant. That's about how to translate ATA to SCSI, not how SCSI devices should behave. When you send the TUR, you only test the sense if check condition was returned. Plus, I think you only want to start the disk if it sends back an ASC/ASCQ saying initialising command is required, don't you? REQUEST SENSE gives GOOD rather than CHECK CONDITION. You want to NOT start the disk if you get back that status. The whole point is to let the disk stay asleep. The idea is to force the disk into runtime suspend so it can sleep, and be started when accessed, but if it is already up and running ( as most ata disks will be ), then
Re: [PATCH 5/6] sd: don't start disks on system resume
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/17/2013 12:48 PM, James Bottomley wrote: Yes, they say that in sd_resume, if we're managing the disk start stop, spin it up. Removing them spins everything up unconditionally. Ahh, right, I got rid of the manage_start_stop check. Probably shouldn't be in this patch, but I do think this flag should be removed as we shouldn't be putting extra wear and tear on the disk by letting it take an emergency head retract on power off. Good systems engineering practise does not have different behaviours for code with and without ifdefs. You want to be adding incremental behaviour not modifying existing. The reason is testing; now you have to build two different kernels to test the two behaviours and mostly one path gets selected by every distro and the other bitrots. Unfortunately, the scsi error handling code is terrible so my attempt to use that was not acceptable, so to do it requires runtime pm. Are you confused about the power states? in SCSI, stand by and idle are different from spun down (or stopped). Right at the moment, start stop manages spin down/spin up not the granular power states for almost every device. The only subsystem that actually uses power state management is firewire. Even if you enable power state management for ATA, we have to cope with non-ATA devices. Yes, and ata drives do not have a stopped state. They always power on in either active or standby. The SCSI standards even mention that SCSI disks may be configured ( but don't say how ) to power on in active rather than stopped. The standards actually say that SCSI devices automatically come out of standby power conditions unless the transport protocol says something different. Yes, so when in standby the drive does not require START UNIT, but if we set the runtime status to suspended, then we will issue one as soon as a request comes in, causing the drive to spin up. We don't want to do that since some commands don't need the media to be spinning. My goal was for the runtime pm status to reflect the drive status, so if the drive is in standby, even though it doesn't need started, the runtime status should show it is suspended. I suppose that if I didn't care about that, I could use TEST UNIT READY and only runtime suspend if it requires the START UNIT command. Yes, but they aren't going to be in standby mode apart from firewire devices; they're going to be stopped. No, because again, ata has no stopped state. ata disks will either be active, or standby if they have power up in standby enabled. What SAT says is irrelevant. That's about how to translate ATA to SCSI, not how SCSI devices should behave. It is relevant because they wouldn't have said to translate it that way if they didn't intend REQUEST SENSE to be used that way. If it was the way you say, then it would say that the TEST UNIT READY command should be translated to CHECK POWER, rather than REQUEST SENSE. Of course, SPC is the more authoritative source, which also confirms it. Because we want to let runtime pm start the disk at a later time, once it is accessed, rather than when the system resumes. I still don't get why a resume function call is doing suspend operations. Because runtime pm only resumes it if it is runtime suspended... if we don't start the drive in the system resume handler, then we need to block any requests until it is started, which is exactly what being runtime suspended does. Because we don't want to block the resume path for many seconds and delay user space unfreezing. I actually need to fix that to unconditionally queues the work item since right now it is in process context, so it just directly calls the other function. Um, then you need to read the code. execute_in_process_context() is a direct function call if we have process context, so it's not going to use a work queue. That's what I just said. Please don't do this. Significant behaviour changes depending on what options are selected are not a good idea from the systems point of view. Options should control incremental changes (function present or not present). Well, that is what it is doing. The function of remaining suspended after system resume is either present or not. -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.17 (MingW32) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJSsJhEAAoJEI5FoCIzSKrwFg0H/3xMLeKAA3ne7dTGfsDJu/E1 MQ9GBNgp+A/b08y8etJF2rDXNbWt0XqPWJnZWNJ/ubirShSVbQrTwIIatJ8qumG5 g6Q3nAmL7LtuqBFKwJJbo9/b6PdlCSXvkD5m4Dz4ZiCMHJQxD91n4m/iEtTXG/Cr /xxdZwPqd1QQ8fOm573QeSeilIsoqDH51HSdaHx8c+C9bubBSAIiWtXDPRac8zLL X7fIDJFyLtv/SoygCOrKuyalfjlDsbxl44qGT9mAgLJyi6Dj+StYbXvDwTtRP360 ZIlYpphLt0ybqFr588rluEu6PgI2V+93ukvmrY8YliaoyKiSP1G5c6Q48kk25Nw= =BE0e -END PGP SIGNATURE- -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at
[PATCH 5/6] sd: don't start disks on system resume
Instead of forcing a disk to start up with the START STOP UNIT command when the system resumes, let it stay asleep if runtime pm is enabled, and it will start the drive when it is accessed. Query the drive to see if it starts up on its own ( like most ATA disks do ) and update the runtime pm status if so. --- drivers/scsi/sd.c | 94 +-- drivers/scsi/sd.h | 1 + 2 files changed, 86 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index e6c4bff..98c082a 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -107,7 +107,8 @@ static int sd_remove(struct device *); static void sd_shutdown(struct device *); static int sd_suspend_system(struct device *); static int sd_suspend_runtime(struct device *); -static int sd_resume(struct device *); +static int sd_resume_system(struct device *); +static int sd_resume_runtime(struct device *dev); static void sd_rescan(struct device *); static int sd_done(struct scsi_cmnd *); static int sd_eh_action(struct scsi_cmnd *, unsigned char *, int, int); @@ -486,11 +487,11 @@ static struct class sd_disk_class = { static const struct dev_pm_ops sd_pm_ops = { .suspend= sd_suspend_system, - .resume = sd_resume, + .resume = sd_resume_system, .poweroff = sd_suspend_system, - .restore= sd_resume, + .restore= sd_resume_system, .runtime_suspend= sd_suspend_runtime, - .runtime_resume = sd_resume, + .runtime_resume = sd_resume_runtime, }; static struct scsi_driver sd_template = { @@ -3166,22 +3167,97 @@ static int sd_suspend_runtime(struct device *dev) return sd_suspend_common(dev, false); } -static int sd_resume(struct device *dev) +static int sd_resume_runtime(struct device *dev) { struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); int ret = 0; - if (!sdkp-device-manage_start_stop) - goto done; - sd_printk(KERN_NOTICE, sdkp, Starting disk\n); + dump_stack(); ret = sd_start_stop_device(sdkp, 1); -done: scsi_disk_put(sdkp); return ret; } +#ifdef CONFIG_PM_RUNTIME +static void sd_resume_work(struct work_struct *work) +{ + struct scsi_disk *sdkp = container_of(work, struct scsi_disk, resume_work.work); + struct scsi_device *sdp = sdkp-device; + struct device *dev = sdkp-dev; + int res; + unsigned char cmd[10] = { 0 }; + struct scsi_sense_hdr sshdr; + const int timeout = sdp-request_queue-rq_timeout; + + cmd[0] = REQUEST_SENSE; + printk(KERN_NOTICE Requesting sense\n); + res = scsi_execute_req_flags(sdp, cmd, DMA_NONE, NULL, 0, +sshdr, timeout, SD_MAX_RETRIES, +NULL, REQ_PM); + if (res) { + sd_print_result(sdkp, res); + goto wakeup; + } + if (driver_byte(res) DRIVER_SENSE) + sd_print_sense_hdr(sdkp, sshdr); + /* we need to evaluate the error return */ + if (scsi_sense_valid(sshdr) + sshdr.sense_key == 0 + sshdr.asc == 0 + sshdr.ascq == 0) + goto wakeup; + + /* finalize suspend */ + printk(KERN_NOTICE Finalizing suspend\n); + pm_runtime_disable(dev); + pm_runtime_set_suspended(dev); + pm_runtime_enable(dev); + blk_post_runtime_suspend(sdp-request_queue, 0); + scsi_disk_put(sdkp); + return; + +wakeup: + /* abort suspend */ + sd_printk(KERN_NOTICE, sdkp, Activating disk %s\n, kobject_name(dev-kobj)); + blk_post_runtime_suspend(sdp-request_queue, -EBUSY); + scsi_disk_put(sdkp); +} +#endif + + +static int sd_resume_system(struct device *dev) +{ + struct scsi_disk *sdkp; +#ifdef CONFIG_PM_RUNTIME + struct scsi_device *sdp; + int ret = 0; + + sdkp = scsi_disk_get_from_dev(dev); + sdp = sdkp-device; + if (!scsi_device_online(sdp)) + { + ret = -ENODEV; + goto err; + } + sd_printk(KERN_NOTICE, sdkp, Suspending disk\n); + /* force runtime pm status to suspending */ + blk_pre_runtime_suspend(sdp-request_queue); + execute_in_process_context(sd_resume_work, + sdkp-resume_work); + +err: + scsi_disk_put(sdkp); + return ret; +#else + if (sdkp-device-manage_start_stop) + return sd_resume_runtime(dev); + return 0; +#endif +} + + /** * init_sd - entry point for this driver (both when built in or when * a module). diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h index 26895ff..0b8afb3 100644 --- a/drivers/scsi/sd.h +++ b/drivers/scsi/sd.h @@ -90,6 +90,7 @@ struct scsi_disk { unsignedlbpvpd : 1; unsignedws10 :
Re: [PATCH 5/6] sd: don't start disks on system resume
On Mon, 2013-12-16 at 18:30 -0500, Phillip Susi wrote: Instead of forcing a disk to start up with the START STOP UNIT command when the system resumes, let it stay asleep if runtime pm is enabled, and it will start the drive when it is accessed. Query the drive to see if it starts up on its own ( like most ATA disks do ) and update the runtime pm status if so. --- drivers/scsi/sd.c | 94 +-- drivers/scsi/sd.h | 1 + 2 files changed, 86 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index e6c4bff..98c082a 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -107,7 +107,8 @@ static int sd_remove(struct device *); static void sd_shutdown(struct device *); static int sd_suspend_system(struct device *); static int sd_suspend_runtime(struct device *); -static int sd_resume(struct device *); +static int sd_resume_system(struct device *); +static int sd_resume_runtime(struct device *dev); static void sd_rescan(struct device *); static int sd_done(struct scsi_cmnd *); static int sd_eh_action(struct scsi_cmnd *, unsigned char *, int, int); @@ -486,11 +487,11 @@ static struct class sd_disk_class = { static const struct dev_pm_ops sd_pm_ops = { .suspend= sd_suspend_system, - .resume = sd_resume, + .resume = sd_resume_system, .poweroff = sd_suspend_system, - .restore= sd_resume, + .restore= sd_resume_system, .runtime_suspend= sd_suspend_runtime, - .runtime_resume = sd_resume, + .runtime_resume = sd_resume_runtime, }; static struct scsi_driver sd_template = { @@ -3166,22 +3167,97 @@ static int sd_suspend_runtime(struct device *dev) return sd_suspend_common(dev, false); } -static int sd_resume(struct device *dev) +static int sd_resume_runtime(struct device *dev) { struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); int ret = 0; - if (!sdkp-device-manage_start_stop) - goto done; - This would make every disk start on resume ... even those which were spun down before suspend. sd_printk(KERN_NOTICE, sdkp, Starting disk\n); + dump_stack(); I assume this is a debug left over that should be removed? ret = sd_start_stop_device(sdkp, 1); -done: scsi_disk_put(sdkp); return ret; } +#ifdef CONFIG_PM_RUNTIME I really don't like the ifdefs. The behaviour changes look pretty radical for something that's supposed to be an add on to system suspend. +static void sd_resume_work(struct work_struct *work) +{ + struct scsi_disk *sdkp = container_of(work, struct scsi_disk, resume_work.work); + struct scsi_device *sdp = sdkp-device; + struct device *dev = sdkp-dev; + int res; + unsigned char cmd[10] = { 0 }; + struct scsi_sense_hdr sshdr; + const int timeout = sdp-request_queue-rq_timeout; + + cmd[0] = REQUEST_SENSE; You can't just send a random request sense because the reply will also be pretty random (either no sense or the sense of the last check condition depending on how the disk is feeling). To see if a disk is spun down, you need to send a test unit ready (TUR). + printk(KERN_NOTICE Requesting sense\n); + res = scsi_execute_req_flags(sdp, cmd, DMA_NONE, NULL, 0, + sshdr, timeout, SD_MAX_RETRIES, + NULL, REQ_PM); + if (res) { + sd_print_result(sdkp, res); + goto wakeup; + } + if (driver_byte(res) DRIVER_SENSE) + sd_print_sense_hdr(sdkp, sshdr); + /* we need to evaluate the error return */ + if (scsi_sense_valid(sshdr) + sshdr.sense_key == 0 + sshdr.asc == 0 + sshdr.ascq == 0) When you send the TUR, you only test the sense if check condition was returned. Plus, I think you only want to start the disk if it sends back an ASC/ASCQ saying initialising command is required, don't you? + goto wakeup; + + /* finalize suspend */ + printk(KERN_NOTICE Finalizing suspend\n); OK, I think I'm lost here ... why are we finalising suspend in the resume thread? + pm_runtime_disable(dev); + pm_runtime_set_suspended(dev); + pm_runtime_enable(dev); + blk_post_runtime_suspend(sdp-request_queue, 0); + scsi_disk_put(sdkp); + return; + +wakeup: + /* abort suspend */ + sd_printk(KERN_NOTICE, sdkp, Activating disk %s\n, kobject_name(dev-kobj)); + blk_post_runtime_suspend(sdp-request_queue, -EBUSY); + scsi_disk_put(sdkp); +} +#endif + + +static int sd_resume_system(struct device *dev) +{ + struct scsi_disk *sdkp; +#ifdef CONFIG_PM_RUNTIME + struct scsi_device *sdp; + int ret = 0; + + sdkp = scsi_disk_get_from_dev(dev); + sdp =