Leonard N. Zubkoff wrote:
>
> Generally, the Mylex PCI RAID controllers take disks offline when certain types
> of unrecoverable errors occur. The driver will log the reason for any disk
> being killed as a console message. Without further information as to precisely
> why the disks were taken offline and whether they all were taken offline
> simultaneously, it's hard to know what happened. Firmware bugs in either the
> controller firmware or disk drives are a plausible reason, as would be a
> problem with the SCSI controller chip on the AcceleRAID, or an electrical
> problem on the SCSI bus.

I have removed the Mylex controller, and since this is a production server, I cannot run experiments on it, so I cannot provide any more information. (I am sorry about that, because I think it is very important to spend some time helping maintainers fix things.)

I also know that dmesg output is important for maintainers, so please find below what I saved before removing the controller. Please note that the output contains many lines caused by my attempts to force the disks back online. I am sorry not to have the very first error messages: there was so much output from the indirect troubles that happened after the initial problem that the beginning of the dmesg buffer was already truncated by the time I logged on to the machine.

The most significant message is probably:

DAC960#0: Physical Drive 0:x killed because of bad tag returned from drive

but I do not find it meaningful at all, and since there is no source code available to scan, I stopped trying to cope with this controller.

Also, if you want to test the controller yourself, I can ask my company to send it to you to keep, since we are not going to use it any more.

In the coming months, I will use the same set of disks, with the same cables (except the one connecting to the controller, since the pins are different), driven by Linux software RAID and the Tekram DC390U2W, so I will send you news of any failure that happens.

I also discovered that Mylex is not using the last megabyte on each disk, so I can use persistent-superblock 1 in my new /etc/raidtab file. This may be interesting for you to know, since it means that changing from Mylex AcceleRAID to Linux software RAID 0.90 can be done without clearing data.

Regards, and many thanks for the great work that you do, even if my personal experience is leading me to drop all sophisticated devices and use simpler ones instead, with the sophisticated features performed by free (source code available) software that I can read in case of failure.

Hubert Tonneau

***** DAC960 RAID Driver Version 2.2.4 of 23 August 1999 *****
Copyright 1998-1999 by Leonard N. Zubkoff <[EMAIL PROTECTED]>
Configuring Mylex DAC960PTL1 PCI RAID Controller
  Firmware Version: 4.06-0-60, Channels: 1, Memory Size: 8MB
  PCI Bus: 0, Device: 13, Function: 1, I/O Address: Unassigned
  PCI Address: 0xF6800000 mapped at 0xD0000000, IRQ Channel: 9
  Controller Queue Depth: 128, Maximum Blocks per Command: 128
  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 128/32
  Physical Devices:
    0:1  Vendor: SEAGATE  Model: ST150176LC  Revision: 0001
         Serial Number: NQ050821000019480B8Z
    0:2  Vendor: SEAGATE  Model: ST150176LC  Revision: 0001
         Serial Number: NQ0518520000194804HG
         Disk Status: Dead, 97691648 blocks, 4 resets
    0:3  Vendor: SEAGATE  Model: ST150176LC  Revision: 0001
         Serial Number: NQ05160200001948K2Z7
         Disk Status: Dead, 97691648 blocks, 4 resets
    0:4  Vendor: SEAGATE  Model: ST150176LC  Revision: 0001
         Serial Number: NQ01051700001948JQZA
         Disk Status: Dead, 97691648 blocks, 4 resets
    0:5  Vendor: SEAGATE  Model: ST150176LC  Revision: 0001
         Serial Number: NQ050859000019480B5E
         Disk Status: Dead, 97691648 blocks, 4 resets
    0:6  Vendor: SEAGATE  Model: ST150176LC  Revision: 0001
         Serial Number: NQ02821600001948JQMF
         Disk Status: Dead, 97691648 blocks, 4 resets
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Offline, 390766592 blocks, Write Back
  No Rebuild or Consistency Check in Progress
DAC960#0: Make Online of Physical Drive 0:6 Succeeded
DAC960#0: Physical Drive 0:6 is now ONLINE
DAC960#0: Make Online of Physical Drive 0:6 Illegal
DAC960#0: Make Online of Physical Drive 0:2 Succeeded
DAC960#0: Physical Drive 0:2 is now ONLINE
DAC960#0: Make Online of Physical Drive 0:3 Succeeded
DAC960#0: Physical Drive 0:3 is now ONLINE
DAC960#0: Make Online of Physical Drive 0:1 Illegal
DAC960#0: Make Online of Physical Drive 0:4 Succeeded
DAC960#0: Make Online of Physical Drive 0:5 Succeeded
DAC960#0: Physical Drive 0:4 is now ONLINE
DAC960#0: Physical Drive 0:5 is now ONLINE
DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE
DAC960#0: Make Online of Physical Drive 0:6 Illegal
DAC960#0: Make Online of Physical Drive 0:1 Illegal
rd/c0d0: unknown partition table
DAC960#0: Physical Drive 0:2 killed because of bad tag returned from drive
DAC960#0: Physical Drive 0:3 killed because of bad tag returned from drive
DAC960#0: Physical Drive 0:4 killed because of bad tag returned from drive
DAC960#0: Physical Drive 0:5 killed because of bad tag returned from drive
DAC960#0: Physical Drive 0:6 killed because of bad tag returned from drive
DAC960#0: Physical Drive 0:2 killed because it was removed
DAC960#0: Physical Drive 0:2 is now DEAD
DAC960#0: Physical Drive 0:3 is now DEAD
DAC960#0: Physical Drive 0:4 is now DEAD
DAC960#0: Physical Drive 0:5 is now DEAD
DAC960#0: Physical Drive 0:6 is now DEAD
DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now OFFLINE
NET4: AppleTalk 0.18 for Linux NET4.0
DAC960#0: Make Online of Physical Drive 0:6 Failed - Unable to Start Device
DAC960#0: Make Online of Physical Drive 0:1 Illegal
DAC960#0: Make Online of Physical Drive 0:2 Failed - Unable to Start Device
DAC960#0: Make Online of Physical Drive 0:3 Failed - Unable to Start Device
DAC960#0: Make Online of Physical Drive 0:4 Failed - Unable to Start Device
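
For anyone who wants to tally the "killed because" messages in a log like the one above, here is a rough Python sketch. It is not part of the DAC960 driver or any Mylex tool: the function name kill_reasons and the SAMPLE text are made up for illustration, and it assumes the log has one message per line in the format shown above.

```python
import re
from collections import defaultdict

# Matches driver lines such as:
#   DAC960#0: Physical Drive 0:2 killed because of bad tag returned from drive
# (pattern inferred from the log above; other driver versions may differ)
KILLED = re.compile(r"DAC960#\d+: Physical Drive (\d+:\d+) killed because (.+)")


def kill_reasons(log_text):
    """Map each physical drive to the reasons it was killed, in log order."""
    reasons = defaultdict(list)
    for line in log_text.splitlines():
        match = KILLED.search(line)
        if match:
            drive, reason = match.groups()
            reasons[drive].append(reason.strip())
    return dict(reasons)


# Hypothetical excerpt, copied from the messages in the log above.
SAMPLE = """\
DAC960#0: Make Online of Physical Drive 0:1 Illegal
DAC960#0: Physical Drive 0:2 killed because of bad tag returned from drive
DAC960#0: Physical Drive 0:3 killed because of bad tag returned from drive
DAC960#0: Physical Drive 0:2 killed because it was removed
DAC960#0: Physical Drive 0:2 is now DEAD
"""

if __name__ == "__main__":
    for drive, why in sorted(kill_reasons(SAMPLE).items()):
        print(drive, "->", "; ".join(why))
```

Run against a saved dmesg file, this gives a quick view of whether all drives report the same reason, which is the kind of information asked for in the quoted message.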