On Sun, 14 Nov 1999, Robert Johannes wrote:

> These are messages I got in my "dmesg" command, and /var/log/messages. 
> Could someone who understands what this is explain it please?

I will try, but I am afraid this will not help much.

> Thanks in advance.
> 
> 
> Robert Johannes.
> 
> ncr53c875-0:0: ERROR (81:0) (0-a7-80) (f/9d) @ (mem 3e30170:00000000).
                         ^
- Bit 0 (1<<0) of DSTAT is set, which indicates that the SCSI chip 
  detected an Illegal SCRIPTS Instruction condition.
- The current SCRIPTS instruction pointer points to memory, at address 
  3e30170, and the SCRIPTS processor fetched a 32-bit value of ZERO there.

If the instruction address pointed within the SCRIPTS area, the driver 
would have printed out the corresponding instruction code value.
Since it did not, the SCRIPTS processor must have jumped out of the 
SCRIPTS areas for some obscure reason.
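
To illustrate the check described above, here is a minimal sketch in Python
(not the driver's actual C code) of how a logging routine could decide
whether to label an address "script" or "mem". The base address and size
are made-up values chosen only for illustration; the real ones depend on
where the driver allocated its SCRIPTS.

```python
# Hypothetical values: where the SCRIPTS area was loaded and how big it is.
SCRIPT_BASE = 0x03E2F000   # assumption, not taken from the report
SCRIPT_SIZE = 0x0800       # assumption, not taken from the report


def locate(dsp, base=SCRIPT_BASE, size=SCRIPT_SIZE):
    """Return ("script", offset) if dsp lies inside the SCRIPTS area,
    otherwise ("mem", dsp)."""
    if base <= dsp < base + size:
        return ("script", dsp - base)
    return ("mem", dsp)


# An address inside the assumed area is reported as a script offset.
print(locate(0x03E2F7C0))
# The address from the report lies outside the assumed area, hence "mem".
print(locate(0x03E30170))
```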

Reasons could be:
- hardware memory problem
- memory allocated to SCRIPTS having been corrupted
- a PCI BUS error that was not detected. Only the sym53c8xx driver makes 
  sure that PCI parity checking is enabled in both the PCI config space 
  and the chip IO register.
- memory allocated to driver data structure corrupted.
- and ... probably thousands of others ...

Obviously a driver bug or chip bug is possible, but the output leads me 
to guess that the driver you are using is an old version, and maybe you 
should give a more recent driver a try.

> ncr53c875-0: regdump: da 10 c0 9d 47 0f 00 07 80 00 80 a7 80 00 07 09.
> ncr53c875-0: have to clear fifos.
> ncr53c875-0: restart (scsi reset).
> ncr53c875-0: copying script fragments into the on-board RAM ...
> ncr53c875-0-<0,0>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 15)

BTW, I have added some guidelines for understanding driver messages
that report severe hardware errors to the latest README.ncr53c8xx 
file (text included below).

What is amazing is that the error condition you report is not described
there, because I thought it was very unlikely to happen and unlikely to 
be understandable without a good knowledge of everything the driver is
dealing with. :-)

------------------------------------------------------------------------
15.2 Understanding hardware error reports

When the driver detects an unexpected error condition, it may display a 
message of the following pattern.

sym53c876-0:1: ERROR (0:48) (1-21-65) (f/95) @ (script 7c0:19000000).
sym53c876-0: script cmd = 19000000
sym53c876-0: regdump: da 10 80 95 47 0f 01 07 75 01 81 21 80 01 09 00.

Some fields in such a message may help you understand the cause of the 
problem, as follows:

sym53c876-0:1: ERROR (0:48) (1-21-65) (f/95) @ (script 7c0:19000000).
............A.........B.C....D.E..F....G.H.......I.....J...K.......
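
As an illustration of the field layout marked above, the following Python
sketch splits such a line into named fields. The regular expression and the
function name `parse_error` are illustrative and not part of the driver. It
assumes the target number is printed in decimal and all other values in
lowercase hex, as in the examples shown here.

```python
import re

# One named group per field A..K of the ERROR line.
ERROR_RE = re.compile(
    r"^(?P<host>\S+?):(?P<target>\d+): ERROR "
    r"\((?P<dstat>[0-9a-f]+):(?P<sist>[0-9a-f]+)\) "
    r"\((?P<socl>[0-9a-f]+)-(?P<sbcl>[0-9a-f]+)-(?P<sbdl>[0-9a-f]+)\) "
    r"\((?P<sxfer>[0-9a-f]+)/(?P<scntl3>[0-9a-f]+)\) "
    r"@ \((?P<where>\w+) (?P<addr>[0-9a-f]+):(?P<value>[0-9a-f]+)\)\.$"
)

HEX_FIELDS = ("dstat", "sist", "socl", "sbcl", "sbdl",
              "sxfer", "scntl3", "addr", "value")


def parse_error(line):
    """Return a dict of fields, or None if the line does not match."""
    m = ERROR_RE.match(line.strip())
    if m is None:
        return None
    raw = m.groupdict()
    fields = {"host": raw["host"],
              "target": int(raw["target"]),   # assumed decimal
              "where": raw["where"]}          # e.g. "script" or "mem"
    for name in HEX_FIELDS:
        fields[name] = int(raw[name], 16)
    return fields


line = ("sym53c876-0:1: ERROR (0:48) (1-21-65) (f/95) "
        "@ (script 7c0:19000000).")
for key, val in parse_error(line).items():
    print(key, val if isinstance(val, str) else hex(val))
```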

Field A : target number.
  SCSI ID of the device the controller was talking with at the moment the 
  error occurred.

Field B : DSTAT io register (DMA STATUS)
  Bit 0x40 : MDPE Master Data Parity Error
             Data parity error detected on the PCI BUS.
  Bit 0x20 : BF   Bus Fault
             PCI bus fault condition detected
  Bit 0x01 : IID  Illegal Instruction Detected
             Set by the chip when it detects an Illegal Instruction format 
             or some condition that makes an instruction illegal.
  Bit 0x80 : DFE Dma Fifo Empty
             Pure status bit that does not indicate an error.
  If the reported DSTAT value contains MDPE (0x40), BF (0x20), or both, 
  then the cause is likely a PCI BUS problem.
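
The DSTAT bits listed above can be decoded mechanically. The following is a
minimal Python sketch; it covers only the four bits described here, and the
names `decode_dstat` and `suspect_pci` are illustrative, not driver
functions.

```python
# Only the DSTAT bits described in this section.
DSTAT_BITS = [
    (0x80, "DFE",  "DMA FIFO empty (pure status, not an error)"),
    (0x40, "MDPE", "master data parity error on the PCI BUS"),
    (0x20, "BF",   "PCI bus fault"),
    (0x01, "IID",  "illegal instruction detected"),
]


def decode_dstat(dstat):
    """Return the names of the described bits that are set in dstat."""
    return [name for mask, name, _ in DSTAT_BITS if dstat & mask]


def suspect_pci(dstat):
    """True if MDPE and/or BF is set, which hints at a PCI BUS problem."""
    return bool(dstat & (0x40 | 0x20))


# DSTAT value 0x81, as in a report of the form "ERROR (81:0) ...".
print(decode_dstat(0x81), suspect_pci(0x81))
```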

Field C : SIST io register (SCSI Interrupt Status)
  Bit 0x08 : SGE  SCSI GROSS ERROR
             Indicates that the chip detected a severe error condition 
             on the SCSI BUS that prevents the SCSI protocol from functioning
             properly.
  Bit 0x04 : UDC  Unexpected Disconnection
             Indicates that the device released the SCSI BUS when the chip 
             was not expecting this to happen. A device may behave so to 
             indicate to the SCSI initiator that an error condition not 
             reportable using the SCSI protocol has occurred.
  Bit 0x02 : RST  SCSI BUS Reset
             Generally SCSI targets do not reset the SCSI BUS, although any 
             device on the BUS can reset it at any time.
  Bit 0x01 : PAR  Parity
             SCSI parity error detected.
  On a faulty SCSI BUS, any error condition among SGE (0x08), UDC (0x04) and 
  PAR (0x01) may be detected by the chip. If your SCSI system sometimes 
  encounters such error conditions, especially SCSI GROSS ERROR, then a SCSI 
  BUS problem is likely the cause of these errors.
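
The same kind of decoding applies to the SIST value. The Python sketch below
knows only the four bits described above and returns any other set bits
separately as a mask, without trying to name them. The function names are
illustrative.

```python
# Only the SIST bits described in this section.
SIST_BITS = [
    (0x08, "SGE"),   # SCSI gross error
    (0x04, "UDC"),   # unexpected disconnection
    (0x02, "RST"),   # SCSI BUS reset
    (0x01, "PAR"),   # SCSI parity error
]
KNOWN_MASK = 0x08 | 0x04 | 0x02 | 0x01


def decode_sist(sist):
    """Return (names of described bits that are set, mask of other bits)."""
    names = [name for mask, name in SIST_BITS if sist & mask]
    return names, sist & ~KNOWN_MASK


def suspect_scsi_bus(sist):
    """True if SGE, UDC or PAR is set, which hints at a SCSI BUS problem."""
    return bool(sist & (0x08 | 0x04 | 0x01))


# SIST value 0x48 from the example line "ERROR (0:48) ...".
names, other = decode_sist(0x48)
print(names, hex(other), suspect_scsi_bus(0x48))
```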

For fields D, E, F, G and H, you may look into the sym53c8xx_defs.h file, 
which contains some minimal comments on IO register bits.
Field D : SOCL  Scsi Output Control Latch
          This register reflects the state of the SCSI control lines the 
          chip wants to drive or compare against.
Field E : SBCL  Scsi Bus Control Lines
          Actual value of control lines on the SCSI BUS.
Field F : SBDL  Scsi Bus Data Lines
          Actual value of data lines on the SCSI BUS.
Field G : SXFER  SCSI Transfer
          Contains the setting of the Synchronous Period for output and 
          the current Synchronous offset (offset 0 means asynchronous).
Field H : SCNTL3 Scsi Control Register 3
          Contains the setting of timing values for both asynchronous and 
          synchronous data transfers. 
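
Fields D, E, G and H can be cross-checked against the "regdump" line. The
Python sketch below rests on two assumptions that are not stated in this
text: that the dump shows the first 16 chip IO registers in address order,
and that they carry the names listed in `REG_NAMES` (as recalled from the
53C8xx register map, to be verified against the chip data manual or
sym53c8xx_defs.h). Treating the low 5 bits of SXFER as the synchronous
offset is likewise an assumption for the 53C875.

```python
# Assumed names of IO registers 0x00..0x0f; verify against the data manual.
REG_NAMES = [
    "SCNTL0", "SCNTL1", "SCNTL2", "SCNTL3",
    "SCID", "SXFER", "SDID", "GPREG",
    "SFBR", "SOCL", "SSID", "SBCL",
    "DSTAT", "SSTAT0", "SSTAT1", "SSTAT2",
]


def parse_regdump(text):
    """Map the 16 hex bytes that follow "regdump:" to register names."""
    values = [int(tok, 16) for tok in text.strip().rstrip(".").split()]
    if len(values) != len(REG_NAMES):
        raise ValueError("expected %d bytes, got %d"
                         % (len(REG_NAMES), len(values)))
    return dict(zip(REG_NAMES, values))


def sync_offset(sxfer):
    """Synchronous offset, assumed to be the low 5 bits (0 = asynchronous)."""
    return sxfer & 0x1F


regs = parse_regdump("da 10 80 95 47 0f 01 07 75 01 81 21 80 01 09 00.")
# Compare with fields D, E, G, H of "(1-21-65) (f/95)" in the example line.
for name in ("SOCL", "SBCL", "SXFER", "SCNTL3"):
    print(name, format(regs[name], "x"))
print("offset", sync_offset(regs["SXFER"]))
```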

Understanding fields I, J, K and the dumps requires good knowledge of 
SCSI standards, chip core functions and internal driver data structures.
You are not required to decode and understand them, unless you want to help 
maintain the driver code.
-------------------------------------------------------------------------

Gérard.

