On 12/7/2012 3:20 PM, Mike Christie wrote:
> On 12/07/2012 03:05 PM, Jeremy Linton wrote:
>> That said, it's far from perfect. The code (as I understand it) isn't 
>> differentiating between isolating the failure, or bringing out the big 
>> hammer in an attempt to correct problems on a specific I_T_L. If you 
>> drop/reset the I_T because one of the LUNs is misbehaving before
>> verifying the status of other LUNs on the target, you risk interrupting
>> operations to functional devices.
> 
> When this code is called the scsi eh has run the abort handler for each 
> outstanding command and that has failed, and it has run the lun/device 
> reset handler and that has failed (or the eh operations succeeded but the
> TUR checkup the scsi eh does failed).

        I think my issue with the error handler (rather than this patch in
particular) is that when scsi_eh_bus_device_reset (which maps to a LUN reset)
fails, it falls through to scsi_eh_target_reset, which issues a TARGET RESET.
That broadens the problem to devices which may be working fine and just
happen to be on the same I_T.
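
        To make the widening concrete, here is a rough user-space sketch of
the reset steps in the order I understand the error handler to try them,
with the scope each one disturbs. The enum, the table and eh_next_step() are
names I made up for illustration; only the scsi_eh_* strings are kernel
function names, and this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for how much of the topology a step disturbs. */
enum eh_scope { SCOPE_LUN, SCOPE_TARGET, SCOPE_BUS, SCOPE_HOST };

struct eh_step {
    const char *handler;   /* kernel function name, for reference only */
    enum eh_scope scope;   /* what gets interrupted if this step runs */
};

/* Reset steps in escalation order (assumed from my reading of the code). */
static const struct eh_step eh_ladder[] = {
    { "scsi_eh_bus_device_reset", SCOPE_LUN    },
    { "scsi_eh_target_reset",     SCOPE_TARGET },
    { "scsi_eh_bus_reset",        SCOPE_BUS    },
    { "scsi_eh_host_reset",       SCOPE_HOST   },
};

/*
 * Return the step tried after `handler` fails, or NULL if `handler` is
 * the last step or is not in the table.
 */
const struct eh_step *eh_next_step(const char *handler)
{
    size_t n = sizeof(eh_ladder) / sizeof(eh_ladder[0]);

    assert(handler != NULL);
    for (size_t i = 0; i + 1 < n; i++) {
        if (strcmp(eh_ladder[i].handler, handler) == 0)
            return &eh_ladder[i + 1];
    }
    return NULL;
}
```

The point is only that the step after a failed LUN reset already has target
scope, with nothing in between that looks at the other LUNs.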

        I think there should be some attempt to determine whether there are
other devices on the I_T, and whether they have failed, before going into
target_reset. It looks like there may have been a plan to do that in
bus_device_reset, but it doesn't appear to be complete.
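
        Roughly what I have in mind, as a user-space model rather than a
patch: after a LUN reset fails, look at the sibling LUNs on the same I_T
before deciding to escalate. Everything here (struct lun_model, the enum,
choose_after_lun_reset_failure()) is hypothetical, and the "responds" flag
stands in for whatever probe the error handler would actually issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one LUN behind an I_T nexus. */
struct lun_model {
    unsigned int lun;
    bool responds;      /* would a probe of this LUN succeed right now? */
};

enum eh_action {
    EH_OFFLINE_LUN,     /* isolate the failed LUN, leave the I_T alone */
    EH_TARGET_RESET,    /* no healthy sibling found, so escalate */
};

/*
 * Decide what to do after a LUN reset has failed on luns[failed_idx].
 * If any other LUN on the same I_T still responds, prefer isolating the
 * failed LUN over resetting the whole target.
 */
enum eh_action choose_after_lun_reset_failure(const struct lun_model *luns,
                                              size_t count,
                                              size_t failed_idx)
{
    assert(luns != NULL && failed_idx < count);

    for (size_t i = 0; i < count; i++) {
        if (i == failed_idx)
            continue;
        if (luns[i].responds)
            return EH_OFFLINE_LUN;
    }
    return EH_TARGET_RESET;
}
```

With a single LUN on the target, or with every sibling also unresponsive,
this still ends up at the target reset, so the existing behavior is kept for
the cases where the big hammer is arguably justified.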


        Now, all that said, I have a few things I wonder about in the
eh_bus_device_reset code. For one, it uses TUR rather than a command with a
more straightforward return status, such as INQUIRY, which also preserves the
check conditions.



