Re: Debugging scsi abort handling ?
On 08/29/2014 06:39 AM, Finn Thain wrote:
> On Thu, 28 Aug 2014, Hannes Reinecke wrote:
>> What might happen, though, is that the command is already dead and
>> gone by the time you're calling ->scsi_done() (if you call it after
>> eh_abort). So there might not _be_ a command upon which you can call
>> ->scsi_done() to start with.
>>
>> Hence any LLDD needs to clear up any internal references after a call
>> to eh_XXX to ensure it doesn't call ->scsi_done() on an invalid
>> command. So even if the LLDD returns 'FAILED' upon a call to eh_XXX
>> it _still_ needs to clear up the internal reference.
>
> This is a question that has been bothering me too. If the host's
> eh_abort_cmd() method returns FAILED, it seems the mid-layer is liable
> to re-issue the same command to the LLD (?)

No. FAILED for any eh_abort_cmd() means that the TMF hasn't been sent.
So the midlayer escalates to the next EH step. The command will only
ever be re-issued once EH completes.

>> Either that or return 'FAILED' for any later eh_XXX function until
>> the internal references can be cleared up.
>
> So if a command may or may not exist after eh_abort_handler() returns
> control to the mid-layer (regardless of SUCCESS or FAILURE), then the
> LLD has to be careful about keeping track of which commands were
> aborted, if those commands are still in the process of cleanup when
> eh_abort_handler() returns.

Yes.

> It's hard to see how that can work when command pointers are only
> unique while a command exists.

Which is why we have the EH callbacks, to give the LLDD a chance to
clean up internal references.

> In effect, this would mean that EH functions cannot return at all,
> until the relevant command(s) are completely forgotten by the LLD; and
> that means the LLD itself may have to escalate abort -> device reset
> -> bus reset -> etc. instead of simply returning FAILED.

More often than not the LLDD has its own internal command structure,
which references the midlayer SCSI command structure via a pointer.
Just clearing that pointer will do the trick.

Take e.g. lpfc. It'll construct its internal command here:

	lpfc_cmd = lpfc_get_scsi_buf(phba, ndlp);
	if (lpfc_cmd == NULL) {
		lpfc_rampdown_queue_depth(phba);

		lpfc_printf_vlog(vport, KERN_INFO, LOG_FCP,
				 "0707 driver's buffer pool is empty, IO busied\n");
		goto out_host_busy;
	}

	/*
	 * Store the midlayer's command structure for the completion phase
	 * and complete the command initialization.
	 */
	lpfc_cmd->pCmd = cmnd;
	lpfc_cmd->rdata = rdata;
	lpfc_cmd->timeout = 0;
	lpfc_cmd->start_time = jiffies;
	cmnd->host_scribble = (unsigned char *)lpfc_cmd;

and then checks for the pointer upon command completion:

static void
lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pIocbIn,
			struct lpfc_iocbq *pIocbOut)
{
	struct lpfc_scsi_buf *lpfc_cmd =
		(struct lpfc_scsi_buf *) pIocbIn->context1;
[ .. ]
	/* Sanity check on return of outstanding command */
	if (!(lpfc_cmd->pCmd))
		return;

But indeed, 'FAILED' is not very meaningful here, leaving the midlayer
with no information about what happened to the command.

Personally I would like to enforce this meaning on the eh_XXX callbacks:

- Upon each eh_XXX callback the LLDD clears any internal references to
  the command / command scope (i.e. eh_abort_cmd clears the references
  to the command, eh_lun_reset clears all internal references to
  commands to this ITL nexus, etc.). This happens irrespective of the
  return code.
- The eh_XXX callback shall return 'FAILED' if the respective TMF (or
  equivalent) could not be initiated.
- The eh_XXX callback shall return 'SUCCESS' if the respective TMF (or
  equivalent) could be initiated.
- After each eh_XXX callback control for this command / command scope
  is transferred back to the midlayer; the LLDD shall not assume the
  associated command structures to remain valid after that point.

I'm tempted to enshrine this in the documentation; that surely will
help me during the EH cleanup. And Hans will have some guidelines on
how to design uas EH :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
h...@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
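
The back-pointer pattern described in this message can be sketched in isolation. This is a minimal userspace model, not driver code: `mid_cmd`, `lld_cmd`, `lld_queue`, `lld_eh_abort`, `lld_complete` and the `eh_result` enum are all hypothetical names invented for the illustration, and `done_calls` merely stands in for a call to ->scsi_done().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real midlayer or lpfc types. */
enum eh_result { EH_FAILED, EH_SUCCESS };

struct mid_cmd {                /* plays the role of struct scsi_cmnd */
	void *host_scribble;    /* LLDD-private back-reference */
	int done_calls;         /* how often the completion hook ran */
};

struct lld_cmd {                /* plays the role of the driver's own buffer */
	struct mid_cmd *mid;    /* NULL once the midlayer owns the command */
};

/* Queue path: link the two structures, as lpfc does with pCmd. */
static void lld_queue(struct lld_cmd *lc, struct mid_cmd *mc)
{
	assert(lc != NULL && mc != NULL);
	lc->mid = mc;
	mc->host_scribble = lc;
}

/* Abort handler: drop the reference first, whatever the return code. */
static enum eh_result lld_eh_abort(struct mid_cmd *mc, int tmf_initiated)
{
	struct lld_cmd *lc = mc->host_scribble;

	if (lc != NULL) {
		lc->mid = NULL;
		mc->host_scribble = NULL;
	}
	return tmf_initiated ? EH_SUCCESS : EH_FAILED;
}

/* Late hardware completion: returns 1 if the command was completed. */
static int lld_complete(struct lld_cmd *lc)
{
	struct mid_cmd *mc = lc->mid;

	if (mc == NULL)
		return 0;       /* already handed back; do not touch it */
	lc->mid = NULL;
	mc->host_scribble = NULL;
	mc->done_calls++;       /* stands in for calling ->scsi_done() */
	return 1;
}
```

With this shape, a completion that races in after a failed abort finds a NULL pointer and returns without touching the midlayer command, which is the property the proposed rules ask for. A real driver would also need locking around the pointer updates, which the sketch leaves out.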
Re: [PATCH 2/2] Drivers: scsi: storvsc: Force discovery of LUNs that may have been removed.
On 08/29/2014 04:42 AM, Mike Christie wrote:
> On 08/27/2014 09:31 AM, Hannes Reinecke wrote:
>> On 08/19/2014 07:54 PM, Christoph Hellwig wrote:
>>> On Sat, Aug 16, 2014 at 08:09:48PM -0700, K. Y. Srinivasan wrote:
>>>> The host asks the guest to scan when a LUN is removed or added.
>>>> The only way a guest can identify the removed LUN is when an I/O is
>>>> attempted on a removed LUN - the SRB status code indicates that the
>>>> LUN is invalid. We currently handle this SRB status and remove the
>>>> device.
>>>>
>>>> Rather than waiting for an I/O to remove the device, force the
>>>> discovery of LUNs that may have been removed prior to discovering
>>>> LUNs that may have been added.
>>>
>>> This looks pretty reasonable to me, but I wonder if we should move
>>> this up to common code so that it happens for any host rescan
>>> triggered by sysfs or other drivers as well.
>>
>> Not without proper testing.
>> Currently we cannot rescan existing devices; the inquiry string is
>> nailed to the sdev structure. The only way to really refresh the
>> information is to delete it and rescan it again.
>
> How are distros handling 0x6/0x3f/0x0e (report luns changed) when it
> gets passed to userspace? Is everyone kicking off a new full (add and
> delete) scan to handle this or logging it? Is the driver returning
> this when the LUNs change?

Currently it's logged to userspace and ignored.
Doing an automated rescan has proven to be dangerous, as it might
disconnect any LUNs which are still in use by applications.
Especially HA or database setups tend to become very annoyed when you
do an automated rescan.

Cheers,

Hannes
Re: [PATCH 2/2] Drivers: scsi: storvsc: Force discovery of LUNs that may have been removed.
On 08/29/14 08:19, Hannes Reinecke wrote:
> On 08/29/2014 04:42 AM, Mike Christie wrote:
>> How are distros handling 0x6/0x3f/0x0e (report luns changed) when it
>> gets passed to userspace? Is everyone kicking off a new full (add and
>> delete) scan to handle this or logging it? Is the driver returning
>> this when the LUNs change?
>
> Currently it's logged to userspace and ignored.
> Doing an automated rescan has proven to be dangerous, as it might
> disconnect any LUNs which are still in use by applications.
> Especially HA or database setups tend to become very annoyed when you
> do an automated rescan.

Has it already been considered to add newly discovered LUNs
automatically and to leave it to the user to remove stale LUNs
manually? That would be similar to what the rescan-scsi-bus.sh script
does without option -r/--remove.

Bart.
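
The add-only policy suggested here amounts to two set differences between the LUNs currently attached and the LUNs the target now reports. The following is a rough userspace sketch of that bookkeeping, assuming LUNs are plain 64-bit numbers; `lun_in` and `lun_diff` are hypothetical helpers and not kernel or rescan-scsi-bus.sh interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if lun occurs in set[0..n). */
static bool lun_in(uint64_t lun, const uint64_t *set, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (set[i] == lun)
			return true;
	return false;
}

/*
 * Copies every LUN of a[] that is missing from b[] into out[] and
 * returns how many were copied. out[] must have room for na entries.
 */
static size_t lun_diff(const uint64_t *a, size_t na,
		       const uint64_t *b, size_t nb, uint64_t *out)
{
	size_t n = 0;

	assert(out != NULL);
	for (size_t i = 0; i < na; i++)
		if (!lun_in(a[i], b, nb))
			out[n++] = a[i];
	return n;
}
```

Calling `lun_diff(reported, ..., known, ...)` yields the LUNs to add automatically, while `lun_diff(known, ..., reported, ...)` yields the stale candidates, which under this policy would only be reported to the administrator rather than removed.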
Re: Debugging scsi abort handling ?
On 29/08/2014 08:08, Hannes Reinecke wrote:
> No. FAILED for any eh_abort_cmd() means that the TMF hasn't been sent.
> So the midlayer escalates to the next EH step. The command will only
> ever be re-issued once EH completes.

Then the answer to Hans's question is yes. It is legal to call
->scsi_done() after the eh_abort handler returns FAILED.

Paolo
Re: [PATCH 2/2] Drivers: scsi: storvsc: Force discovery of LUNs that may have been removed.
On 08/29/2014 09:39 AM, Bart Van Assche wrote:
> On 08/29/14 08:19, Hannes Reinecke wrote:
>> On 08/29/2014 04:42 AM, Mike Christie wrote:
>>> How are distros handling 0x6/0x3f/0x0e (report luns changed) when it
>>> gets passed to userspace? Is everyone kicking off a new full (add
>>> and delete) scan to handle this or logging it? Is the driver
>>> returning this when the LUNs change?
>>
>> Currently it's logged to userspace and ignored.
>> Doing an automated rescan has proven to be dangerous, as it might
>> disconnect any LUNs which are still in use by applications.
>> Especially HA or database setups tend to become very annoyed when you
>> do an automated rescan.
>
> Has it already been considered to add newly discovered LUNs
> automatically and to leave it to the user to remove stale LUNs
> manually? That would be similar to what the rescan-scsi-bus.sh script
> does without option -r/--remove.

As of now we're still missing an in-kernel infrastructure which would
allow us to react to any sense codes; currently we're relying on the
administrator to set up a udev rule here.

Cheers,

Hannes
Re: [PATCH 00/22] scsi logging update
On 08/28/2014 09:24 PM, Douglas Gilbert wrote:
> On 14-08-28 01:33 PM, Hannes Reinecke wrote:
>> Hi all,
>>
>> here's my next round of scsi logging updates.
>> Main feature is the update to have all logging statements in one line
>> so that they won't be broken up even under high load. This will
>> dramatically improve debugging.
>> Additionally all printk() statements are moved to dev_printk()
>> variants to ensure proper device tagging and keep the systemd journal
>> happy.
>
> s/all/most/ ??

My, you are picky.

> Surely there are situations where a dev cannot be associated with a
> printk(). For example in transport discovery before any devices are
> found (or after, if none are found). LLDs often helpfully log their
> HBA's firmware details prior to discovery (and may fail before
> discovery).

Indeed there are some printks left, e.g. during SCSI initialization
where we don't have any device.
And I didn't modify the LLDDs, which have their own logging.
(But most don't use dev_printk() either.)

> And it is possible to write via sysfs to a driver that has no devices
> attached. How does one log that?

Well, I haven't come across any logging messages here, so the question
has never arisen.

Cheers,

Hannes
Re: Debugging scsi abort handling ?
On Fri, 29 Aug 2014, Hannes Reinecke wrote:
> On 08/29/2014 06:39 AM, Finn Thain wrote:
>> On Thu, 28 Aug 2014, Hannes Reinecke wrote:
>>> What might happen, though, is that the command is already dead and
>>> gone by the time you're calling ->scsi_done() (if you call it after
>>> eh_abort). So there might not _be_ a command upon which you can call
>>> ->scsi_done() to start with.
>>>
>>> Hence any LLDD needs to clear up any internal references after a
>>> call to eh_XXX to ensure it doesn't call ->scsi_done() on an invalid
>>> command. So even if the LLDD returns 'FAILED' upon a call to eh_XXX
>>> it _still_ needs to clear up the internal reference.
>>
>> This is a question that has been bothering me too. If the host's
>> eh_abort_cmd() method returns FAILED, it seems the mid-layer is
>> liable to re-issue the same command to the LLD (?)
>
> No. FAILED for any eh_abort_cmd() means that the TMF hasn't been sent.

Makes sense, though it appears to contradict this advice about
returning SUCCESS in some situations:
http://marc.info/?l=linux-scsi&m=140923498632496&w=2

> The command will only ever be re-issued once EH completes.

...

> But indeed, 'FAILED' is not very meaningful here, leaving the midlayer
> with no information about what happened to the command.
>
> Personally I would like to enforce this meaning on the eh_XXX
> callbacks:
>
> - Upon each eh_XXX callback the LLDD clears any internal references to
>   the command / command scope (i.e. eh_abort_cmd clears the references
>   to the command, eh_lun_reset clears all internal references to
>   commands to this ITL nexus, etc.). This happens irrespective of the
>   return code.
> - The eh_XXX callback shall return 'FAILED' if the respective TMF (or
>   equivalent) could not be initiated.
> - The eh_XXX callback shall return 'SUCCESS' if the respective TMF (or
>   equivalent) could be initiated.
> - After each eh_XXX callback control for this command / command scope
>   is transferred back to the midlayer; the LLDD shall not assume the
>   associated command structures to remain valid after that point.

Perhaps that last constraint should be relaxed to "After the final EH
callback (whether implemented or unimplemented by the host), command /
command scope is transferred back to the midlayer..."

A more severe TMF is probably mandatory (e.g. bus reset) but if the
driver author later added a milder one (e.g. bus device reset), your
rule would mean that the existing handler would then operate under new
constraints, which might cause surprises.

[...]

> I'm tempted to enshrine this in the documentation;

It is helpful, thanks.
Re: [PATCH 0/15] SCSI XCOPY support for the kernel and device mapper
>>>>> "Mike" == Mike Snitzer <snit...@redhat.com> writes:

Mike> It would be ideal for XCOPY support to make its way upstream for
Mike> 3.18.. but the window for staging this work in time is closing.

Mike> Any chance you might have some time to review Mikulas' revised
Mike> approach to your initial XCOPY support?

It is at the top of my list.

-- 
Martin K. Petersen	Oracle Linux Engineering
Re: Debugging scsi abort handling ?
On 08/29/2014 12:14 PM, Finn Thain wrote:
> On Fri, 29 Aug 2014, Hannes Reinecke wrote:
>> On 08/29/2014 06:39 AM, Finn Thain wrote:
>>> On Thu, 28 Aug 2014, Hannes Reinecke wrote:
>>>> What might happen, though, is that the command is already dead and
>>>> gone by the time you're calling ->scsi_done() (if you call it after
>>>> eh_abort). So there might not _be_ a command upon which you can
>>>> call ->scsi_done() to start with.
>>>>
>>>> Hence any LLDD needs to clear up any internal references after a
>>>> call to eh_XXX to ensure it doesn't call ->scsi_done() on an
>>>> invalid command. So even if the LLDD returns 'FAILED' upon a call
>>>> to eh_XXX it _still_ needs to clear up the internal reference.
>>>
>>> This is a question that has been bothering me too. If the host's
>>> eh_abort_cmd() method returns FAILED, it seems the mid-layer is
>>> liable to re-issue the same command to the LLD (?)
>>
>> No. FAILED for any eh_abort_cmd() means that the TMF hasn't been
>> sent.
>
> Makes sense, though it appears to contradict this advice about
> returning SUCCESS in some situations:
> http://marc.info/?l=linux-scsi&m=140923498632496&w=2

Well, if the LLDD detects an invalid command (i.e. if it cannot find
any internal command matching the midlayer command) that's an automatic
success, obviously.

So we should rephrase things to:

- The eh_XXX callback shall return 'SUCCESS' if the respective TMF (or
  equivalent) could be initiated or if the matching command reference
  has already been completed by the LLDD. Otherwise the eh_XXX callback
  shall return 'FAILED'.

>> The command will only ever be re-issued once EH completes.
>
> ...
>
>> But indeed, 'FAILED' is not very meaningful here, leaving the
>> midlayer with no information about what happened to the command.
>>
>> Personally I would like to enforce this meaning on the eh_XXX
>> callbacks:
>>
>> - Upon each eh_XXX callback the LLDD clears any internal references
>>   to the command / command scope (i.e. eh_abort_cmd clears the
>>   references to the command, eh_lun_reset clears all internal
>>   references to commands to this ITL nexus, etc.). This happens
>>   irrespective of the return code.
>> - The eh_XXX callback shall return 'FAILED' if the respective TMF (or
>>   equivalent) could not be initiated.
>> - The eh_XXX callback shall return 'SUCCESS' if the respective TMF
>>   (or equivalent) could be initiated.
>> - After each eh_XXX callback control for this command / command scope
>>   is transferred back to the midlayer; the LLDD shall not assume the
>>   associated command structures to remain valid after that point.
>
> Perhaps that last constraint should be relaxed to "After the final EH
> callback (whether implemented or unimplemented by the host), command /
> command scope is transferred back to the midlayer..."

No, that's wrong.
By the time any eh_XXX callbacks are triggered control _is_ already
back at the midlayer. I.e. the command timeout triggered and the block
layer already set the REQ_ATOM_COMPLETED flag, short-circuiting any
attempts to call ->scsi_done().
So with the callbacks the midlayer actually informs the LLDD about a
certain fact; there is nothing the LLDD can do to change ownership at
that point.

(Correction: During the call of any eh_XXX callbacks control _is_ back
at the LLDD, otherwise the callbacks would be pointless. It's just that
the LLDD shouldn't assume the command is valid _after_ any of the
eh_XXX callbacks has terminated.)

> A more severe TMF is probably mandatory (e.g. bus reset) but if the
> driver author later added a milder one (e.g. bus device reset), your
> rule would mean that the existing handler would then operate under new
> constraints, which might cause surprises.

Well, _if_ we were to adopt this rule we obviously have to audit
existing LLDDs to check whether the rule is followed, and tweak them if
not.

Cheers,

Hannes
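
The rephrased return-code rule in this message can be written down as a small decision function. This is only a sketch of the proposal as worded here, not existing midlayer behaviour; `eh_return_code`, `cmd_state` and the `eh_result` enum are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the midlayer's SUCCESS / FAILED codes. */
enum eh_result { EH_FAILED, EH_SUCCESS };

/* What the LLDD found when it looked up the midlayer command. */
enum cmd_state { CMD_OUTSTANDING, CMD_ALREADY_COMPLETED };

/*
 * Proposed rule: SUCCESS if the TMF (or equivalent) could be initiated,
 * or if the LLDD has no outstanding command matching the request.
 * Everything else is FAILED.
 */
static enum eh_result eh_return_code(enum cmd_state state, bool tmf_initiated)
{
	assert(state == CMD_OUTSTANDING || state == CMD_ALREADY_COMPLETED);

	if (state == CMD_ALREADY_COMPLETED)
		return EH_SUCCESS;      /* nothing left to abort */
	return tmf_initiated ? EH_SUCCESS : EH_FAILED;
}
```

Note that the function says nothing about clearing internal references; under the proposal that step happens in every case, before the return code is chosen.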
Re: Debugging scsi abort handling ?
Hi,

On 08/29/2014 12:30 PM, Hannes Reinecke wrote:
> On 08/29/2014 12:14 PM, Finn Thain wrote:
>> On Fri, 29 Aug 2014, Hannes Reinecke wrote:
>>> On 08/29/2014 06:39 AM, Finn Thain wrote:
>>>> On Thu, 28 Aug 2014, Hannes Reinecke wrote:
>>>>> What might happen, though, is that the command is already dead
>>>>> and gone by the time you're calling ->scsi_done() (if you call it
>>>>> after eh_abort). So there might not _be_ a command upon which you
>>>>> can call ->scsi_done() to start with.
>>>>>
>>>>> Hence any LLDD needs to clear up any internal references after a
>>>>> call to eh_XXX to ensure it doesn't call ->scsi_done() on an
>>>>> invalid command. So even if the LLDD returns 'FAILED' upon a call
>>>>> to eh_XXX it _still_ needs to clear up the internal reference.
>>>>
>>>> This is a question that has been bothering me too. If the host's
>>>> eh_abort_cmd() method returns FAILED, it seems the mid-layer is
>>>> liable to re-issue the same command to the LLD (?)
>>>
>>> No. FAILED for any eh_abort_cmd() means that the TMF hasn't been
>>> sent.
>>
>> Makes sense, though it appears to contradict this advice about
>> returning SUCCESS in some situations:
>> http://marc.info/?l=linux-scsi&m=140923498632496&w=2
>
> Well, if the LLDD detects an invalid command (i.e. if it cannot find
> any internal command matching the midlayer command) that's an
> automatic success, obviously.
>
> So we should rephrase things to:
>
> - The eh_XXX callback shall return 'SUCCESS' if the respective TMF (or
>   equivalent) could be initiated or if the matching command reference
>   has already been completed by the LLDD. Otherwise the eh_XXX
>   callback shall return 'FAILED'.

You're talking about "could be initiated", so that means that at this
point the abort does not yet have to be completed, do I get that right?

What should the LLDD then do when the abort finishes, call eh_scsi_done
on the cmnd?

What about the abort never finishing (timeout): does the mid-layer
track this, or should the LLDD do that?

Regards,

Hans
Re: [PATCH 2/2] Drivers: scsi: storvsc: Force discovery of LUNs that may have been removed.
On Fri, 2014-08-29 at 10:13 +0200, Hannes Reinecke wrote:
> On 08/29/2014 09:39 AM, Bart Van Assche wrote:
>> On 08/29/14 08:19, Hannes Reinecke wrote:
>>> On 08/29/2014 04:42 AM, Mike Christie wrote:
>>>> How are distros handling 0x6/0x3f/0x0e (report luns changed) when
>>>> it gets passed to userspace? Is everyone kicking off a new full
>>>> (add and delete) scan to handle this or logging it? Is the driver
>>>> returning this when the LUNs change?
>>>
>>> Currently it's logged to userspace and ignored.
>>> Doing an automated rescan has proven to be dangerous, as it might
>>> disconnect any LUNs which are still in use by applications.
>>> Especially HA or database setups tend to become very annoyed when
>>> you do an automated rescan.
>>
>> Has it already been considered to add newly discovered LUNs
>> automatically and to leave it to the user to remove stale LUNs
>> manually? That would be similar to what the rescan-scsi-bus.sh script
>> does without option -r/--remove.
>
> As of now we're still missing an in-kernel infrastructure which would
> allow us to react to any sense codes; currently we're relying on the
> administrator to set up a udev rule here.

Um, I thought this was supposed to solve that problem:

commit 279afdfe78a020b4b1a68bffd0009b961b12982e
Author: Ewan D. Milne <emi...@redhat.com>
Date:   Thu Aug 8 15:07:48 2013 -0400

    [SCSI] Generate uevents on certain unit attention codes

The idea was supposed to be that, as you say, log scrubbers are hard to
configure and break every time someone fixes a spelling error, so we
could now listen for a report luns data change uevent instead.

James
Re: [PATCH 2/2] Drivers: scsi: storvsc: Force discovery of LUNs that may have been removed.
On Thu, 2014-08-28 at 21:42 -0500, Mike Christie wrote:
> On 08/27/2014 09:31 AM, Hannes Reinecke wrote:
>> On 08/19/2014 07:54 PM, Christoph Hellwig wrote:
>>> On Sat, Aug 16, 2014 at 08:09:48PM -0700, K. Y. Srinivasan wrote:
>>>> The host asks the guest to scan when a LUN is removed or added.
>>>> The only way a guest can identify the removed LUN is when an I/O is
>>>> attempted on a removed LUN - the SRB status code indicates that the
>>>> LUN is invalid. We currently handle this SRB status and remove the
>>>> device.
>>>>
>>>> Rather than waiting for an I/O to remove the device, force the
>>>> discovery of LUNs that may have been removed prior to discovering
>>>> LUNs that may have been added.
>>>
>>> This looks pretty reasonable to me, but I wonder if we should move
>>> this up to common code so that it happens for any host rescan
>>> triggered by sysfs or other drivers as well.
>>
>> Not without proper testing.
>> Currently we cannot rescan existing devices; the inquiry string is
>> nailed to the sdev structure. The only way to really refresh the
>> information is to delete it and rescan it again.
>
> How are distros handling 0x6/0x3f/0x0e (report luns changed) when it
> gets passed to userspace? Is everyone kicking off a new full (add and
> delete) scan to handle this or logging it? Is the driver returning
> this when the LUNs change?

Currently the udev rules we have to handle these events are installed
with a separate package, and only the REPORTED LUNS DATA HAS CHANGED
rule does anything; the others are commented out. It turns out that
e.g. multipath stops using a path if it notices that the capacity has
changed, so we need to do some more work there; it is under discussion.

We do not delete LUNs that disappear from the REPORT LUNS inventory,
although someone could write their own udev rule to do that if desired.
Beware the case where a LUN is remapped to a different LUN number, or
where a LUN's WWID is used for a device with different data (e.g. a LUN
deleted and re-added with the same WWID, although I don't know if this
actually happens).

Consider that the UA just provides notification to userspace of a
change -- lack of notification does not prevent someone from deciding
to rescan for new LUNs via sysfs any time they feel like it. So you
can't just change the storage configuration and hope that no-one
notices until you are done making changes.

-Ewan

> Also is the driver getting a 0x5/0x25/0 (invalid LUN) when the LUN
> does not exist, or is it just getting that SRB_STATUS_INVALID_LUN
> error code?
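
For readers unfamiliar with the numeric triples used in this thread: they are sense key / ASC / ASCQ values. 0x6/0x3f/0x0e is UNIT ATTENTION with REPORTED LUNS DATA HAS CHANGED, and 0x5/0x25/0x00 is ILLEGAL REQUEST with LOGICAL UNIT NOT SUPPORTED. A minimal classifier for just these two cases might look as follows; `sense_triple`, `lun_event` and `classify_sense` are hypothetical names for the sketch and not the kernel's sense-handling API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sense key, additional sense code and qualifier, as quoted in the thread. */
struct sense_triple {
	uint8_t key;
	uint8_t asc;
	uint8_t ascq;
};

/* Hypothetical event classes, limited to the two codes discussed here. */
enum lun_event {
	LUN_EVENT_NONE,
	LUN_EVENT_INVENTORY_CHANGED,    /* 0x6/0x3f/0x0e */
	LUN_EVENT_INVALID_LUN           /* 0x5/0x25/0x00 */
};

static enum lun_event classify_sense(const struct sense_triple *s)
{
	assert(s != NULL);

	/* UNIT ATTENTION, REPORTED LUNS DATA HAS CHANGED */
	if (s->key == 0x06 && s->asc == 0x3f && s->ascq == 0x0e)
		return LUN_EVENT_INVENTORY_CHANGED;
	/* ILLEGAL REQUEST, LOGICAL UNIT NOT SUPPORTED */
	if (s->key == 0x05 && s->asc == 0x25 && s->ascq == 0x00)
		return LUN_EVENT_INVALID_LUN;
	return LUN_EVENT_NONE;
}
```

What to do with each class (log, rescan, remove the device) is exactly the policy question debated above, so the sketch deliberately stops at classification.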
Re: [PATCH 2/5] kexec: Export kexec_in_progress
On 08/04/2014 09:21 AM, Brian King wrote:
> On 07/28/2014 03:28 PM, Brian King wrote:
>> Export kexec_in_progress for use by device drivers
>> and other modules to optimize kexec boot.
>>
>> Signed-off-by: Brian King <brk...@linux.vnet.ibm.com>
>> ---
>>
>>  kernel/kexec.c |    2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff -puN kernel/kexec.c~kexec_export_in_prog kernel/kexec.c
>> --- linux/kernel/kexec.c~kexec_export_in_prog	2014-07-23 17:05:24.851887935 -0500
>> +++ linux-bjking1/kernel/kexec.c	2014-07-23 17:05:24.856887970 -0500
>> @@ -1716,3 +1716,5 @@ int kernel_kexec(void)
>>  	mutex_unlock(&kexec_mutex);
>>  	return error;
>>  }
>> +
>> +EXPORT_SYMBOL_GPL(kexec_in_progress);
>
> Eric,
>
> Can I get an ack on this so we can take this entire series through the
> SCSI tree?

Eric,

Any issues with this patch?

Thanks,

Brian
Re: Buffer I/O error after s2ram with usb storage
On Wed, 27 Aug 2014 10:54:53 -0400, Alan Stern
<st...@rowland.harvard.edu> wrote:
> On Wed, 27 Aug 2014, Matthieu CASTET wrote:
>> Ping
>>
>> I have also got a problem with a usb sdcard reader (without power cut
>> during suspend).
>>
>> The usb storage driver calls scsi_report_bus_reset after device
>> reset, but because of commit dfcf7775 4, we don't ignore unit
>> attention if sshdr.asc == 0x28 && sshdr.ascq == 0x00 (Not-ready to
>> ready).
>>
>> If dfcf7775 is reverted there is no more Buffer I/O error.
>>
>> Is it possible to find a way to restore the behavior from before
>> commit dfcf7775 (no Buffer I/O error after device reset) after a
>> suspend to ram?
>
> Since that commit was written to fix a problem with certain cdrom
> drives, maybe we could work around the issue by modifying the commit.
> Have it go back to the original behavior if the device isn't a cdrom
> drive.
>
> That's not a complete fix (it won't help when a CD drive is attached
> via USB), but maybe it's better than nothing.

Ok. Note that to handle all cases we also need to filter unit_attention
in scsi_test_unit_ready. Otherwise the DISK_EVENT_MEDIA_CHANGE event is
set and check_disk_change will invalidate the vfs cache.

Matthieu

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 2bc0362..e994061 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2030,8 +2030,12 @@ scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries,
 		result = scsi_execute_req(sdev, cmd, DMA_NONE, NULL, 0, sshdr,
 					  timeout, retries, NULL);
 		if (sdev->removable && scsi_sense_valid(sshdr) &&
-		    sshdr->sense_key == UNIT_ATTENTION)
-			sdev->changed = 1;
+		    sshdr->sense_key == UNIT_ATTENTION) {
+			if (sdev->expecting_cc_ua)
+				sdev->expecting_cc_ua = 0;
+			else
+				sdev->changed = 1;
+		}
 	} while (scsi_sense_valid(sshdr) &&
 		 sshdr->sense_key == UNIT_ATTENTION && --retries);
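
The effect of the proposed hunk can be modelled outside the kernel. The sketch below mirrors only the decision made inside the retry loop; `fake_sdev` and `note_unit_attention` are hypothetical stand-ins for `struct scsi_device` and the patched code path, and 0x06 is the UNIT ATTENTION sense key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SENSE_KEY_UNIT_ATTENTION 0x06

/* Hypothetical reduction of struct scsi_device to the three flags used. */
struct fake_sdev {
	bool removable;
	bool expecting_cc_ua;   /* a UA is expected, e.g. after a reset */
	bool changed;           /* media-change indication for upper layers */
};

/* Mirrors the patched branch: an expected UA is consumed silently. */
static void note_unit_attention(struct fake_sdev *sdev, bool sense_valid,
				uint8_t sense_key)
{
	assert(sdev != NULL);

	if (!sdev->removable || !sense_valid ||
	    sense_key != SENSE_KEY_UNIT_ATTENTION)
		return;

	if (sdev->expecting_cc_ua)
		sdev->expecting_cc_ua = false;  /* swallow the expected UA */
	else
		sdev->changed = true;           /* genuine media change */
}
```

In this model the first unit attention after a reset clears `expecting_cc_ua` and leaves `changed` untouched, while a second, unexpected one still marks the medium as changed.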
Re: [PATCH] bnx2fc: fix incorrect DMA memory mapping in bnx2fc_map_sg()
Chad,

can you send out your last version with a proper changelog, signoff,
and the ack from Eddie included?

Also can you prioritize getting the shared skb patch tested?

Thanks,

Christoph
Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with 2TB HDDs)
> From: Alan Stern <st...@rowland.harvard.edu>
>
> If you try to repartition the drive under Windows using the deficient
> adapter, you'll see that the problem still exists. It just doesn't
> show up during normal use.

So in summary, the Windows workaround is icky, but it allows any use
but repartitioning to be done on the attached disk.

Dale
Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with 2TB HDDs)
Is there an 'easy' way to override the detected size of a storage
device from userspace?

If we had that, someone could write a helper application which looked
for this particular fubar and tried to Do The Right Thing(tm), or at
least offer the user some options.

Matt

On Fri, Aug 29, 2014 at 2:07 PM, Dale R. Worley <wor...@alum.mit.edu> wrote:
>> From: Alan Stern <st...@rowland.harvard.edu>
>>
>> If you try to repartition the drive under Windows using the deficient
>> adapter, you'll see that the problem still exists. It just doesn't
>> show up during normal use.
>
> So in summary, the Windows workaround is icky, but it allows any use
> but repartitioning to be done on the attached disk.
>
> Dale

-- 
Matthew Dharm
Maintainer, USB Mass Storage driver for Linux