> Thank you, Matt. Then I have another question:
> As we know SCSI mid-layer issue a command to LLDD by
> host->hostt->queuecommand(cmd, scsi_done); and in the meantime a
> timer is set. When the timer expires, SCSI mid-layer know the
> execution of command has failed.
> My question is: when SCSI device is surprise-removed, if SCSI
> mid-layer issue a command to this removed device, will mid-layer
> has to wait a timeout before it can know the execution of command
> failed? Or is there any other mechanism that LLDD can notify
> mid-layer that execution of command failed without waiting for a
> timeout?
What we did in the FC transport: there is a transport-level timeout,
kept per target, that controls how long we "insulate" the system from
the device's disappearance.

When the device is first removed, the transport has the midlayer
suspend i/o to (i.e. block) the device, so no i/o failures occur other
than timeouts on in-flight i/o's. As the midlayer (for disk devices)
typically retries i/o's, even the in-flight errors don't result in an
error to the application, because the retry gets delayed by the
blocked state of the device.

If the device returns within the insulation period, i/o resumes and
the system continues happily along its way.

If the device does not return, the timeout fires and the device is
restarted (unblocked). The i/o then reaches the LLDD, which is
expected to fail the i/o immediately, as the target doesn't exist.
The midlayer reacts accordingly and places the device into an offline
state.

If the device is re-added, the LLDD sets the target to a good state,
but the midlayer keeps the devices in an offline state until steps are
taken to bring them back online. E.g. the admin takes whatever steps
are necessary to clean up the system after the previous failure of the
device, then brings the device online by writing the device state to
running and rescanning the device.

If multipath solutions are in place, they will want to set the
"insulation" timeout as low as possible, so that their alternate
pathing can kick in as soon as possible. The multipathing solution,
upon device re-addition, is required to take the steps to bring the
device back online.

-- james s
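
For anyone who wants to trace the block / timeout / offline sequence
described in the reply above, here is a minimal sketch of it as a toy
state machine. It is not kernel code and not the FC transport's API:
ModelDevice, insulation_timeout, tick() and every other name in it are
hypothetical, and time is passed in by hand rather than driven by a
real timer. It only models the ordering of events as the post
describes them.

```python
from enum import Enum


class DevState(Enum):
    RUNNING = "running"
    BLOCKED = "blocked"
    OFFLINE = "offline"


class ModelDevice:
    """Toy model of one SCSI device behind a transport.

    Not kernel code; all names are hypothetical illustrations.
    """

    def __init__(self, insulation_timeout):
        self.insulation_timeout = insulation_timeout  # seconds
        self.state = DevState.RUNNING
        self.target_present = True
        self.removed_at = None
        self.queued = []  # i/o held back while the device is blocked

    def _dispatch(self, io):
        # The i/o reaches the "LLDD": it fails at once if the target is gone,
        # and the "midlayer" then places the device offline.
        if self.target_present:
            return (io, "ok")
        self.state = DevState.OFFLINE
        return (io, "failed")

    def _flush(self):
        pending, self.queued = self.queued, []
        return [self._dispatch(io) for io in pending]

    def submit(self, io):
        if self.state is DevState.OFFLINE:
            return "rejected"
        if self.state is DevState.BLOCKED:
            self.queued.append(io)  # delayed, not failed
            return "queued"
        return self._dispatch(io)[1]

    def target_removed(self, now):
        # Transport blocks the device and starts the insulation period.
        self.target_present = False
        if self.state is DevState.RUNNING:
            self.state = DevState.BLOCKED
            self.removed_at = now

    def tick(self, now):
        # Insulation timeout fired: unblock, so held i/o reaches the LLDD.
        if (self.state is DevState.BLOCKED
                and now - self.removed_at >= self.insulation_timeout):
            self.state = DevState.RUNNING
            return self._flush()
        return []

    def target_returned(self, now):
        results = self.tick(now)  # the timer may already have expired
        self.target_present = True
        if self.state is DevState.BLOCKED:
            self.state = DevState.RUNNING
            results += self._flush()
        return results  # an OFFLINE device stays offline

    def bring_online(self):
        # Stands in for the admin (or multipath) setting the state to running.
        if self.state is DevState.OFFLINE and self.target_present:
            self.state = DevState.RUNNING
            return True
        return False
```

With insulation_timeout=30, an i/o submitted while the device is
blocked completes normally if target_returned() is called before 30
seconds have passed. If tick() runs at or after 30 seconds instead,
the same i/o comes back as failed and the device goes offline, where
it stays until bring_online() is called after the target is back.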