> 1. srp_unmap_data() and srp_remove_req() for .eh_abort_handler(scmnd)
>    a. abort get timeout or
>    b. req->cmd_done or
>    c. !req->tsk_status
> 2. we should do step (1) for .eh_abort_handler(scmnd) only and don't
>    do step 1 for .eh_device_reset_handler(scmnd) since same scsi command
>    is used for all .eh_handler()
> 3. scsi command is used in all .eh_handler() will be freed by scsi
>    midlayer at the end of error handling sequences
> 4. If we don't do step 1, scsi command which is used in all
>    .eh_handler() and freed is still in our pending queue and is
>    referenced in srp_reconnect_target() / reinit request ring

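For reference, the sequence in the quoted steps 1, 3 and 4 can be written down as a toy model. This is only a sketch of the scenario as the quoted message describes it: every name here (toy_cmd, toy_req, toy_remove_req, toy_eh_finish, toy_count_stale, toy_run) is hypothetical and none of it is the real ib_srp or SCSI midlayer code. In particular, it simply assumes the command is released before the reconnect walks the request ring, which may or may not match what scsi_error.c actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the real ib_srp or SCSI midlayer types. */
struct toy_cmd {
    bool freed;              /* set when the "midlayer" releases the command */
};

struct toy_req {
    struct toy_cmd *scmnd;   /* NULL once removed from the pending queue */
};

/* Step 1 of the quoted list: drop the request from the pending queue. */
static void toy_remove_req(struct toy_req *req)
{
    assert(req != NULL);
    req->scmnd = NULL;
}

/* Step 3 (assumed ordering): the command is released when EH ends. */
static void toy_eh_finish(struct toy_cmd *cmd)
{
    assert(cmd != NULL);
    cmd->freed = true;
}

/* Step 4: a reconnect walks the ring; count entries that still point
 * at a command that has already been released. */
static int toy_count_stale(const struct toy_req *ring, size_t n)
{
    int stale = 0;

    for (size_t i = 0; i < n; i++)
        if (ring[i].scmnd != NULL && ring[i].scmnd->freed)
            stale++;
    return stale;
}

/* Run the whole sequence: abort handler, end of EH, then reconnect. */
static int toy_run(bool remove_in_abort_handler)
{
    struct toy_cmd cmd = { .freed = false };
    struct toy_req ring[1] = { { .scmnd = &cmd } };

    if (remove_in_abort_handler)
        toy_remove_req(&ring[0]);
    toy_eh_finish(&cmd);
    return toy_count_stale(ring, 1);
}
```

In this model toy_run(true) finds no stale entry, while toy_run(false) leaves one entry pointing at a released command, which is the situation step 4 describes.
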
So I finally got a chance to look at this in detail. It does look
like we should remove the request in (1) if the command finishes or
the abort succeeds. However if the abort times out then the command
is still out there -- shouldn't we wait for the
eh_device_reset_handler and then flush all matching commands there?

And I don't understand (4) -- isn't srp_reconnect_target() being
called from srp_reset_host() as part of the error handling sequence?
Unless I'm misreading the code in scsi_error.c, commands don't get
freed (assuming all aborts and device resets fail) before then.

What am I missing? In your case, where the abort and device reset
fail and then the host reset gets called, where was the command
getting freed?

Thanks,
  Roland