A while back, this disk array began throwing SCSI errors (after having
worked like a champ for more than a year).  We've replaced the SCSI
controller, cable, and terminator, and Maxtronic has replaced the array's
internal SCSI cabling and run lengthy diagnostics.

The array's diagnostics report nothing wrong.  The RHEL system sees the
device (sdb) during bootup (even getting the capacity right) and
successfully attaches it, but then encounters a string of resets and
finally offlines it.

Oct 26 07:26:29 s31e066 kernel: scsi2 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
Oct 26 07:26:29 s31e066 kernel:         <Adaptec 39320A Ultra320 SCSI adapter>
Oct 26 07:26:29 s31e066 kernel:         aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs
Oct 26 07:26:29 s31e066 kernel:   Type:   Direct-Access                      ANSI SCSI revision: 05
Oct 26 07:26:29 s31e066 kernel: scsi2:A:0:0: Tagged Queuing enabled.  Depth 4
Oct 26 07:26:29 s31e066 kernel:  target2:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 127)
Oct 26 07:26:29 s31e066 kernel: SCSI device sdb: 3900325888 512-byte hdwr sectors (1996967 MB)
Oct 26 07:26:29 s31e066 kernel: SCSI device sdb: drive cache: write back
Oct 26 07:26:29 s31e066 kernel: SCSI device sdb: 3900325888 512-byte hdwr sectors (1996967 MB)
Oct 26 07:26:29 s31e066 kernel: SCSI device sdb: drive cache: write back
Oct 26 07:26:29 s31e066 kernel: sd 2:0:0:0: Attached scsi disk sdb
Oct 26 07:27:46 s31e066 kernel: scsi2: At time of recovery, card was not paused
Oct 26 07:27:46 s31e066 kernel: scsi2: Dumping Card State at program address 0x24 Mode 0x22
Oct 26 07:27:46 s31e066 kernel: SCSISIGI[0x24] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]
Oct 26 07:27:46 s31e066 kernel: SCSISEQ0[0x40] SCSISEQ1[0x12] SEQCTL0[0x0] SEQINTCTL[0x0]
Oct 26 07:27:46 s31e066 kernel: MK_MESSAGE_SCSIID[0xff] SSTAT0[0x10] SSTAT1[0x0]
Oct 26 07:27:46 s31e066 kernel:   3 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x7]
Oct 26 07:27:46 s31e066 kernel: scsi2: FIFO0 Free, LONGJMP == 0x8259, SCB 0x3
Oct 26 07:27:46 s31e066 kernel: scsi2: FIFO1 Free, LONGJMP == 0x8063, SCB 0x3
Oct 26 07:27:46 s31e066 kernel: scsi2: LQISTATE = 0x1, LQOSTATE = 0x1a, OPTIONMODE = 0x52
Oct 26 07:27:46 s31e066 kernel: scsi2: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
Oct 26 07:27:46 s31e066 kernel: scsi2: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
Oct 26 07:27:46 s31e066 kernel: scsi2: REG0 == 0x3, SINDEX = 0x102, DINDEX = 0x102
Oct 26 07:27:46 s31e066 kernel: scsi2: SCBPTR == 0xff03, SCB_NEXT == 0xff00, SCB_NEXT2 == 0x0
Oct 26 07:27:46 s31e066 kernel: scsi2:0:0:0: Cmd aborted from QINFIFO
Oct 26 07:27:56 s31e066 kernel: scsi2: At time of recovery, card was not paused
Oct 26 07:27:56 s31e066 kernel: scsi2: Dumping Card State at program address 0x24 Mode 0x22
Oct 26 07:27:56 s31e066 kernel: SCSISIGI[0x24] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]
TCTL[0x0]
Oct 26 07:27:56 s31e066 kernel: MK_MESSAGE_SCSIID[0xff] SSTAT0[0x10] SSTAT1[0x0]
Oct 26 07:27:56 s31e066 kernel:   3 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x7]
Oct 26 07:27:56 s31e066 kernel: scsi2: FIFO0 Free, LONGJMP == 0x8259, SCB 0x3
Oct 26 07:27:56 s31e066 kernel: scsi2: FIFO1 Free, LONGJMP == 0x8063, SCB 0x3
Oct 26 07:27:56 s31e066 kernel: scsi2: LQISTATE = 0x1, LQOSTATE = 0x1a, OPTIONMODE = 0x52
Oct 26 07:27:56 s31e066 kernel: scsi2: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
Oct 26 07:27:56 s31e066 kernel: scsi2: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
Oct 26 07:27:56 s31e066 kernel: scsi2: REG0 == 0x3, SINDEX = 0x102, DINDEX = 0x102
Oct 26 07:27:56 s31e066 kernel: scsi2: SCBPTR == 0xff03, SCB_NEXT == 0xff00, SCB_NEXT2 == 0x0
Oct 26 07:27:56 s31e066 kernel: scsi2:0:0:0: Cmd aborted from QINFIFO
Oct 26 07:27:56 s31e066 kernel: scsi2: Device reset code sleeping
Oct 26 07:28:01 s31e066 kernel: scsi2: Device reset timer expired (active 1)
Oct 26 07:28:01 s31e066 kernel: scsi2: Device reset returning 0x2003
Oct 26 07:28:11 s31e066 kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error recovery
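
In case anyone wants to compare against their own logs, here is a rough
Python sketch that pulls the key events (attach, card-state dump, aborted
command, reset timeout, offline) out of a syslog file and prints them in
order.  The message fragments are copied from the excerpt above; the
function names and the /var/log/messages path are my own assumptions, and
the line format it expects is the classic "Mon DD HH:MM:SS host kernel:"
prefix shown here.

```python
import re

# Message fragments copied from the log excerpt in this post.
# The event names on the left are labels made up for this sketch.
EVENTS = {
    "attached": "Attached scsi disk",
    "recovery_dump": "Dumping Card State",
    "cmd_aborted": "Cmd aborted from QINFIFO",
    "reset_expired": "Device reset timer expired",
    "offlined": "Device offlined",
}

# Classic syslog prefix: "Oct 26 07:26:29 host kernel: message"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+kernel:\s*(.*)$"
)


def summarize(lines):
    """Return (timestamp, event_name) tuples in the order they appear."""
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # continuation or truncated line without a prefix
        stamp, _host, message = match.groups()
        for name, fragment in EVENTS.items():
            if fragment in message:
                found.append((stamp, name))
                break
    return found


def summarize_file(path="/var/log/messages"):
    """Read a syslog file (path is an assumption) and summarize it."""
    with open(path, errors="replace") as handle:
        return summarize(handle)
```

Calling summarize_file() on the log and printing each tuple gives a short
timeline, which makes it easier to see how long the device stays up between
attach and the first recovery attempt.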

Hoping there is someone who might have a similar device and a solution,
but otherwise open to any suggestions.  Thanks.




-- 
Tim Evans, TKEvans.com, Inc.    |   5 Chestnut Court
UNIX System Admin Consulting    |   Owings Mills, MD 21117
http://www.tkevans.com/         |   443-394-3864
http://www.come-here.com/News/  |   [email protected]


_______________________________________________
rhelv5-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/rhelv5-list
