Roland Dreier wrote:
> > And what if you comment out the line
> > .eh_device_reset_handler = srp_reset_device,
> > does that fix it?
> No
Now I'm really confused.
Me too.
It seems we lose the connection to the target (BTW -- do you know why
the connection is getting killed?).
I reported the error in my original email responding to
your FMR patch. On the ia64 system with a PCI-X HCA, I got the
async event IB_EVENT_QP_ACCESS_ERR at the initiator (and a CQE
with IB_COMPLETION_STATUS_REMOTE_ACCESS_ERROR status at my
target).
I still have not captured an IB analyzer trace (as you suggested).
So the SCSI midlayer times out commands and tries to abort them. But
we have no connection so the abort fails. The SCSI command shouldn't
get freed now (at least if I'm understanding scsi_error.c correctly).
Then we have no .eh_device_reset_handler so everything should fall
through to calling our .eh_host_reset_handler without freeing any SCSI
commands. And then we crash on a use-after-free of a SCSI command.
So where is that command getting freed on us??
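To make the escalation order described above concrete, here is a minimal user-space sketch. It is not the code in scsi_error.c; it only models the three handlers named in this thread (abort, device reset, host reset) and leaves out every other step the real midlayer takes. The names eh_hooks, eh_escalate, abort_fails and reset_ok are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes modeled loosely on the midlayer's SUCCESS/FAILED convention. */
enum eh_result { EH_SUCCESS, EH_FAILED };

struct cmd { int id; };   /* stand-in for struct scsi_cmnd */

/* Hypothetical subset of the host template's error-handler hooks. */
struct eh_hooks {
    enum eh_result (*abort)(struct cmd *);
    enum eh_result (*device_reset)(struct cmd *);   /* may be NULL */
    enum eh_result (*host_reset)(struct cmd *);
};

enum eh_stage {
    STAGE_ABORT,
    STAGE_DEVICE_RESET,
    STAGE_HOST_RESET,
    STAGE_GAVE_UP
};

/* Walk the ladder in order; report which stage recovered the command. */
enum eh_stage eh_escalate(const struct eh_hooks *h, struct cmd *c)
{
    assert(h != NULL && c != NULL);
    if (h->abort && h->abort(c) == EH_SUCCESS)
        return STAGE_ABORT;
    /* A missing handler is skipped, so recovery falls through. */
    if (h->device_reset && h->device_reset(c) == EH_SUCCESS)
        return STAGE_DEVICE_RESET;
    if (h->host_reset && h->host_reset(c) == EH_SUCCESS)
        return STAGE_HOST_RESET;
    return STAGE_GAVE_UP;
}

/* Example handlers: the abort fails because the connection is gone. */
enum eh_result abort_fails(struct cmd *c) { (void)c; return EH_FAILED; }
enum eh_result reset_ok(struct cmd *c)    { (void)c; return EH_SUCCESS; }
```

With the device reset hook left NULL and the abort failing, eh_escalate() lands on the host reset stage, which is the path walked through above.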
The SCSI command that is passed to the error handlers
(.eh_abort_handler, .eh_host_reset_handler...) is not the
same as the use-after-free SCSI command from req->scmnd.
There is some glitch where the SCSI command in req->scmnd
has already been freed by the SCSI midlayer; however, the
request is still in our pending request queue.
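The hazard described here can be sketched in user-space C. This is not the ib_srp code and not a proposed patch; it only illustrates the idea that a request should drop its command pointer in the same step that takes it off the pending queue, so a later reset never touches a command the midlayer has already freed. The types and functions (struct request, queue_cmd, finish_cmd, reset_pending) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REQ 4

struct cmd { int id; };        /* stand-in for struct scsi_cmnd */

struct request {
    struct cmd *scmnd;         /* NULL once the command is finished */
    int pending;               /* nonzero while on the pending queue */
};

struct target { struct request req[MAX_REQ]; };

/* Queue a command: remember it and mark the slot pending. */
int queue_cmd(struct target *t, struct cmd *c)
{
    for (int i = 0; i < MAX_REQ; i++) {
        if (!t->req[i].pending) {
            t->req[i].scmnd = c;
            t->req[i].pending = 1;
            return i;
        }
    }
    return -1;                 /* no free slot */
}

/* Finish a command: clear the pointer and unlink in the same step. */
void finish_cmd(struct target *t, int idx)
{
    assert(idx >= 0 && idx < MAX_REQ);
    t->req[idx].scmnd = NULL;
    t->req[idx].pending = 0;
}

/* Host reset: finish only requests that still own a live command. */
int reset_pending(struct target *t)
{
    int n = 0;
    for (int i = 0; i < MAX_REQ; i++) {
        if (t->req[i].pending && t->req[i].scmnd != NULL) {
            finish_cmd(t, i);
            n++;
        }
    }
    return n;                  /* number of requests flushed */
}
```

If a request were left pending after its command had been finished elsewhere, the reset loop would dereference a stale req->scmnd, which matches the crash seen here.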
Vu