A while back, this disk array began throwing SCSI errors (after having worked like a champ for more than a year). We've replaced the SCSI controller, cable, and terminator, and Maxtronic has replaced the array's internal SCSI cabling and run lengthy diagnostics.
The array's diagnostics report nothing wrong, and the RHEL system sees the device (sdb) during bootup (even getting the capacity right) and successfully attaches it, but then encounters a string of resets and finally offlines it:

Oct 26 07:26:29 s31e066 kernel: scsi2 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
Oct 26 07:26:29 s31e066 kernel: <Adaptec 39320A Ultra320 SCSI adapter>
Oct 26 07:26:29 s31e066 kernel: aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs
Oct 26 07:26:29 s31e066 kernel: Type: Direct-Access ANSI SCSI revision: 05
Oct 26 07:26:29 s31e066 kernel: scsi2:A:0:0: Tagged Queuing enabled. Depth 4
Oct 26 07:26:29 s31e066 kernel: target2:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 127)
Oct 26 07:26:29 s31e066 kernel: SCSI device sdb: 3900325888 512-byte hdwr sectors (1996967 MB)
Oct 26 07:26:29 s31e066 kernel: SCSI device sdb: drive cache: write back
Oct 26 07:26:29 s31e066 kernel: SCSI device sdb: 3900325888 512-byte hdwr sectors (1996967 MB)
Oct 26 07:26:29 s31e066 kernel: SCSI device sdb: drive cache: write back
Oct 26 07:26:29 s31e066 kernel: sd 2:0:0:0: Attached scsi disk sdb
Oct 26 07:27:46 s31e066 kernel: scsi2: At time of recovery, card was not paused
Oct 26 07:27:46 s31e066 kernel: scsi2: Dumping Card State at program address 0x24 Mode 0x22
Oct 26 07:27:46 s31e066 kernel: SCSISIGI[0x24] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]
Oct 26 07:27:46 s31e066 kernel: SCSISEQ0[0x40] SCSISEQ1[0x12] SEQCTL0[0x0] SEQINTCTL[0x0]
Oct 26 07:27:46 s31e066 kernel: MK_MESSAGE_SCSIID[0xff] SSTAT0[0x10] SSTAT1[0x0]
Oct 26 07:27:46 s31e066 kernel: 3 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x7]
Oct 26 07:27:46 s31e066 kernel: scsi2: FIFO0 Free, LONGJMP == 0x8259, SCB 0x3
Oct 26 07:27:46 s31e066 kernel: scsi2: FIFO1 Free, LONGJMP == 0x8063, SCB 0x3
Oct 26 07:27:46 s31e066 kernel: scsi2: LQISTATE = 0x1, LQOSTATE = 0x1a, OPTIONMODE = 0x52
Oct 26 07:27:46 s31e066 kernel: scsi2: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
Oct 26 07:27:46 s31e066 kernel: scsi2: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
Oct 26 07:27:46 s31e066 kernel: scsi2: REG0 == 0x3, SINDEX = 0x102, DINDEX = 0x102
Oct 26 07:27:46 s31e066 kernel: scsi2: SCBPTR == 0xff03, SCB_NEXT == 0xff00, SCB_NEXT2 == 0x0
Oct 26 07:27:46 s31e066 kernel: scsi2:0:0:0: Cmd aborted from QINFIFO
Oct 26 07:27:56 s31e066 kernel: scsi2: At time of recovery, card was not paused
Oct 26 07:27:56 s31e066 kernel: scsi2: Dumping Card State at program address 0x24 Mode 0x22
Oct 26 07:27:s31e066 kernel: SCSISIGI[0x24] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1] TCTL[0x0]
Oct 26 07:27:56 s31e066 kernel: MK_MESSAGE_SCSIID[0xff] SSTAT0[0x10] SSTAT1[0x0]
Oct 26 07:27:56 s31e066 kernel: 3 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x7]
Oct 26 07:27:56 s31e066 kernel: scsi2: FIFO0 Free, LONGJMP == 0x8259, SCB 0x3
Oct 26 07:27:56 s31e066 kernel: scsi2: FIFO1 Free, LONGJMP == 0x8063, SCB 0x3
Oct 26 07:27:56 s31e066 kernel: scsi2: LQISTATE = 0x1, LQOSTATE = 0x1a, OPTIONMODE = 0x52
Oct 26 07:27:56 s31e066 kernel: scsi2: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
Oct 26 07:27:56 s31e066 kernel: scsi2: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
Oct 26 07:27:56 s31e066 kernel: scsi2: REG0 == 0x3, SINDEX = 0x102, DINDEX = 0x102
Oct 26 07:27:56 s31e066 kernel: scsi2: SCBPTR == 0xff03, SCB_NEXT == 0xff00, SCB_NEXT2 == 0x0
Oct 26 07:27:56 s31e066 kernel: scsi2:0:0:0: Cmd aborted from QINFIFO
Oct 26 07:27:56 s31e066 kernel: scsi2: Device reset code sleeping
Oct 26 07:28:01 s31e066 kernel: scsi2: Device reset timer expired (active 1)
Oct 26 07:28:01 s31e066 kernel: scsi2: Device reset returning 0x2003
Oct 26 07:28:11 s31e066 kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error recovery

I'm hoping someone here has a similar device and a solution, but I'm open to any suggestions.

Thanks.

--
Tim Evans, TKEvans.com, Inc.    |    5 Chestnut Court
UNIX System Admin Consulting    |    Owings Mills, MD 21117
http://www.tkevans.com/         |    443-394-3864
http://www.come-here.com/News/  |    [email protected]

_______________________________________________
rhelv5-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/rhelv5-list
