Gerard,

Here is what seems to be happening.

 1) Send Write command to device X, it disconnects.
 2) Issue Test Unit Ready command to non-existent device Y.
 3) This times out and goes into the interrupt service routine (isr).
 4) The isr reads the status bits. Somehow the chip is enabled.
 5) Device X reselects (before the isr is finished).
 6) The isr thinks it is connected because of the error.
 7) The isr does a reset.
 8) More bad things happen.
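
To make the ordering issue in the steps above concrete, here is a toy model
of the race. It is only a sketch: none of these names exist in the
sym53c8xx driver, and the real isr looks at far more state than one flag.
It only shows that the outcome depends on whether device X reselects
before or after the isr has finished with the selection timeout.

```c
#include <stdbool.h>

/* Toy model only: none of these names come from the sym53c8xx driver. */
enum isr_action {
    ISR_COMPLETE_TIMEOUT,   /* fail the command for absent device Y */
    ISR_RESET_BUS           /* step 7: reset everything */
};

struct bus_model {
    bool sel_timeout;   /* step 3: selection of device Y timed out */
    bool connected;     /* step 5: device X has reselected the host */
};

static void select_absent_target(struct bus_model *b) { b->sel_timeout = true; }
static void target_reselects(struct bus_model *b) { b->connected = true; }

/* Steps 6 and 7: being connected while handling an error leads to a reset. */
static enum isr_action isr_decide(const struct bus_model *b)
{
    return b->connected ? ISR_RESET_BUS : ISR_COMPLETE_TIMEOUT;
}

/* Run the scenario; the flag picks which side of the race wins. */
static enum isr_action run_scenario(bool reselect_before_isr_done)
{
    struct bus_model b = { false, false };

    select_absent_target(&b);
    if (reselect_before_isr_done)
        target_reselects(&b);
    return isr_decide(&b);
}
```

In this model run_scenario(true) ends in ISR_RESET_BUS, while
run_scenario(false) ends in ISR_COMPLETE_TIMEOUT, which is the behaviour
one would want for a probe of a missing device.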

NOTE:
To test this theory, I modified ncr_recover_scsi_int() to include the
following code:

        if (scntl1 & ISCON) {
                if (hsts == HS_SEL_TIMEOUT)
                        printk("SEL Timeout, but host is connected!?\n");
                goto reset_all;
        }

The message does get triggered.
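
For experimenting outside the kernel, the same check can be pulled out into
a small pure function. This is a sketch under assumptions: the numeric
values given to ISCON and HS_SEL_TIMEOUT below are placeholders for
illustration (the real definitions are in the driver headers), and the
enum and function names are made up for this example.

```c
#include <stdint.h>

/* Placeholder values, assumed for illustration; see the driver headers. */
#define ISCON           0x10u   /* assumed: "connected" bit in SCNTL1 */
#define HS_SEL_TIMEOUT  0x85u   /* assumed: host status, selection timeout */

/* Hypothetical names, not part of the driver. */
enum recover_case {
    RECOVER_NOT_CONNECTED,      /* check does not fire, handler continues */
    RECOVER_CONNECTED,          /* connected, other status: reset_all */
    RECOVER_TIMEOUT_CONNECTED   /* the odd case: message, then reset_all */
};

/* Mirrors the test above: connected bit first, then the host status. */
static enum recover_case classify_recover(uint8_t scntl1, uint8_t hsts)
{
    if (!(scntl1 & ISCON))
        return RECOVER_NOT_CONNECTED;
    if (hsts == HS_SEL_TIMEOUT)
        return RECOVER_TIMEOUT_CONNECTED;
    return RECOVER_CONNECTED;
}
```

Only the RECOVER_TIMEOUT_CONNECTED case corresponds to the printk firing;
both connected cases still end in the reset, as in the code above.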

Any clues on how to prevent the reconnect before the error handler has
done its thing? Is this a chip bug? NOTE: I also tried calling
ncr_complete() at the point where the printk statement is. That helped
some, but the SCSI mid-layer showed some timeout messages.

<>< Lance.


Gerard Roudier wrote:
> 
> On Wed, 21 Jul 1999, D. Lance Robinson wrote:
> 
> > Hi Gerard & others,
> >
> > Using a Symbios/LSI 53c895 chip and the sym53c8xx driver, I am trying to
> > scan the bus for newly added devices using the
> >
> >    echo "scsi add-single-device 0 0 id 0 " >/proc/scsi/scsi
> >
> > technique. This generally works on an idle bus (doesn't always see a
> > device), but bad things happen when there is activity on the bus when
> > the 'add' command is issued. A bus reset gets generated when a device
> > reselects the bus. And this can happen several times when trying to
> > 'add' (probe) a non-existent device.
> >
> > Here is a scenario of what is happening (with the help of a SCSI
> > analyzer.)
> 
> Then I have the only option to trust you. ;-)
> 
> > 1) One or more commands get queued up in device X.
> > 2) The 'add-single-device' command is issued for non-existent device Y.
> > 3)   Exactly what happens now is a bit fuzzy
> > 4) Device X reselects the host, and sends the 0x80 Identify message
> > 5) The SCSI Bus is RESET.
> > 6) Loops back to 4 for zero or more times
> 
> Strange, but probably quite uncommon and unusual situation. :)
> Will be pleased to know if, at least, the thing does recover from the
> problem.
> 
> > NOTE: I am using Seagate Barracuda devices (ST39102LC) and this is on a
> > PowerPC system.
> >
> > Any ideas ?
> 
> Absolutely none.
> 
> The scenario that should happen should be that the initiator handles the
> selection timeout procedure using compliant timings and that the target
> that wants to reselect also uses compliant timings for detecting the BUS
> free phase, then rearbitrating for the BUS and then reselecting and
> sending its message.
> Also, the controller SCSI core must latch the right number of the target
> that reselected, etc..., etc..., etc...
> 
> I am interested in all the data you have on the problem (kernel messages,
> SCSI traces, other information).
> 
> Thanks for the report.
> 
> Regards,
>    Gérard.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
