On Mon, Aug 3, 2009 at 10:36 PM, Roland Dreier <[email protected]> wrote: > > > Issuing a SCSI reset command on an SRP initiator after the SRP connection > has > > been closed triggers a NULL pointer dereference. The patch below fixes this > > NULL pointer dereference. > > > > See also http://bugzilla.kernel.org/show_bug.cgi?id=13893. > > Thanks for debugging this... a couple of questions: > > > + BUG_ON(!req->scmnd->device); > > Why BUG_ON() here? Can we return failure or something, rather than > crashing the whole system?
The function srp_send_tsk_mgmt() contains a.o. the following statement: "tsk_mgmt->lun = cpu_to_be64((u64) req->scmnd->device->lun << 48);". This is the statement that triggered the NULL pointer dereference. Whether or not a BUG_ON() is appropriate here depends on which of the following two alternatives is preferred: should the caller guarantee that req->scmnd->device != NULL or should srp_send_tsk_mgmt() should handle the condition req->scmnd->device == NULL itself ? > > + if (!req->scmnd->device) > > + return FAILED; > > How do we end up in srp_reset_device() with req->scmnd->device == NULL? > Presumably req->scmnd should match scmnd if I am understanding the code > properly -- and then scmnd->device == NULL?? Good question. I did not yet analyze why this happens. But before I started developing a patch I had first verified that scmnd->device is NULL at that point by inserting the statement WARN_ON(!scmnd->device). A clue might be that without the above patch the BUG message on the initiator system is triggered just after the "SRP reset_device called" message has been logged. Bart. _______________________________________________ general mailing list [email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
