* [EMAIL PROTECTED] [2006-06-29 20:21]:
> I'm trying to make a spare drive the "hot spare", without rebooting my
> OpenBSD 3.9 server. Bioctl is letting me at least query my raid array,
> but it's not letting me set an "Unused" drive to "Hot Spare":
there's bugs with -H in 3.9, current has it betterer.
Hi again, it took me a while but I've installed the snapshot from
2006.09.25, put my LSI SATA 300-8x through its paces, and found there
are still issues with setting hot spares on the 300-8x.
300-8x firmware info
====================
Here is the relevant card information (as displayed on boot-up):
LSI MegaRAID BIOS Version H425 (Build Nov 17, 2004)
Copyright(c) 2004 LSI Logic Corp.
HA-1 (Bus 3 Dev 14) LSI MegaRAID SATA PCI-X
Standard FW 813G DRAM =128MB (SDRAM)
That is, I am running firmware version 813G. [According to the LSI Logic
website, it was released on 2005.03.11, and is now 5 versions old.]
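The same firmware string also shows up in the ami0 attach line of the dmesg at the end of this message, so it can be read without rebooting into the LSI BIOS. A minimal Python sketch of pulling it out; the function name and the regular expression are made up for illustration and are based on that single line only, so other cards or driver versions may print something different:

```python
import re

# Attach line copied from the dmesg at the end of this message.
AMI_LINE = "ami0: LSI 3008, 32b, FW 813G, BIOS vH425, 128MB RAM"


def parse_ami_attach(line):
    """Return (firmware, bios, ram_mb) from an ami attach line, or None.

    The pattern is an assumption derived from one sample line; it is not
    taken from the driver source.
    """
    m = re.search(r"FW (\S+), BIOS v(\S+), (\d+)MB RAM", line)
    if m is None:
        return None
    return m.group(1), m.group(2), int(m.group(3))


if __name__ == "__main__":
    print(parse_ami_attach(AMI_LINE))
```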
Problem summary (problems with bioctl -H on a SATA 300-8x)
===============
To summarize (I've included the full test case below) - I can now use
bioctl -H to set an "Unused" drive to "Hot spare". However, despite
showing as hot spare in *both* bioctl and the LSI boot menu, when I
fail a drive in my RAID array, the "hot spare" fails to behave as such
(it will not be integrated into the degraded RAID array).
It gets worse - once a drive has been set as a hot spare through bioctl,
it can never be changed back to unused, nor can it be properly set as a
hot spare through the LSI boot menu. Essentially that slot is now
unusable. The only solution that I have found is to "Clear
configuration" from the LSI boot menu (which then requires reinstalling
the contents of the drives).
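To make "behave as such" concrete, here is a toy Python model of what a hot spare is expected to do when a member fails. It is only a sketch of the expected behaviour described above, not the MegaRAID firmware logic; the class and attribute names are invented for illustration:

```python
class ToyRaid5:
    """Toy model of the expected hot-spare behaviour (not firmware logic)."""

    def __init__(self, members, hot_spares=()):
        self.width = len(members)           # disks needed for a healthy array
        self.members = list(members)        # disks currently in the array
        self.hot_spares = list(hot_spares)  # standby disks

    def fail(self, disk):
        """Remove a failed member; a working hot spare takes its place."""
        self.members.remove(disk)
        if self.hot_spares:
            self.members.append(self.hot_spares.pop(0))

    @property
    def degraded(self):
        """True while the array has fewer members than it was built with."""
        return len(self.members) < self.width


if __name__ == "__main__":
    array = ToyRaid5(members=[0, 1, 2], hot_spares=[3])
    array.fail(0)
    print(array.members, array.degraded)
```

With a spare set through bioctl -H, the card instead behaves as if hot_spares were empty: the array stays degraded even though the disk is reported as a hot spare.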
Problem workaround (avoid bioctl -H on my SATA 300-8x)
------------------
If I only set hot spares through the LSI boot menu, the RAID array
behaves as expected. Unfortunately this requires rebooting.
Request for help
================
So, I'm wondering if someone can guide me through peeking at the memory
on this card, so we can compare the difference between setting a hot
spare through bioctl and setting a hot spare through the LSI boot menu.
Alternatively, I can upgrade the firmware and re-run the test case if
you would prefer.
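If two such dumps can be captured (how to get them off the card is exactly the open question), comparing them is the easy part. A minimal Python sketch of a byte-by-byte comparison; the function names are hypothetical and the bytes in the demo are made up, not real card data:

```python
def diff_bytes(a, b):
    """Yield (offset, byte_a, byte_b) wherever the two dumps differ.

    If one dump is shorter, its missing bytes are reported as None.
    """
    for offset in range(max(len(a), len(b))):
        x = a[offset] if offset < len(a) else None
        y = b[offset] if offset < len(b) else None
        if x != y:
            yield offset, x, y


def fmt(byte):
    """Format one byte as two hex digits, or '--' if it is missing."""
    return "--" if byte is None else f"{byte:02x}"


if __name__ == "__main__":
    # Made-up bytes standing in for two real dumps.
    via_bioctl = bytes([0x00, 0x01, 0x02])
    via_boot_menu = bytes([0x00, 0x05, 0x02, 0x09])
    for offset, x, y in diff_bytes(via_bioctl, via_boot_menu):
        print(f"{offset:08x}: {fmt(x)} -> {fmt(y)}")
```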
Thanks,
Matthew Mulrooney
The test cases
==============
[Pardon the lack of terminal captures - I forgot to transfer my
typescript logs off the partition before recreating the array :(.
These notes were taken from a second machine.]
s => step succeeded
F => step failed
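The s/F markers are regular enough to tally mechanically. A minimal Python sketch, assuming every step starts with "s " or "F ", wrapped lines continue the previous step, and bracketed notes are commentary; the name parse_steps is made up for illustration, and the sample is a few steps copied from the first test case below:

```python
def parse_steps(text):
    """Return a list of (succeeded, description) tuples from a step log."""
    steps = []
    in_note = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Bracketed notes are commentary, possibly spanning several lines.
        if in_note or line.startswith("["):
            in_note = not line.endswith("]")
            continue
        marker, _, rest = line.partition(" ")
        if marker in ("s", "F") and rest:
            steps.append((marker == "s", rest))
        elif steps:
            # A wrapped line: fold it into the previous step.
            ok, desc = steps[-1]
            steps[-1] = (ok, desc + " " + line)
    return steps


SAMPLE = """\
s Single disk failure
s Disk 1: Fails (I pulled it)
F Disk 0: FAILS TO GET INTEGRATED, DESPITE STILL BEING MARKED AS A
HOT SPARE
[Now I'm in what's-the-best-of-a-bad-set-of-options mode]
s Replace the failed drive
"""

if __name__ == "__main__":
    for ok, desc in parse_steps(SAMPLE):
        if not ok:
            print("FAILED:", desc)
```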
Normal case (RAID 5 + one hot spare)
-----------
s Configure array from the LSI boot menu
  s Clear configuration
  s New configuration
    s Disks 0, 1, 2: RAID 5 array
    s Disk 3: Hot spare
s Install OpenBSD-snapshot-2006.09.25
s Single disk failure
  s Disk 0: Fails (I pulled it from the CSE-M35T1 enclosure)
  s Disk 3: Automatically replaces it
s Replace failed disk
  s Replace Disk 0 with a new disk
  s Observe that Disk 0 is marked as "Unused" through bioctl
  s Set Disk 0 to be a hot spare (through bioctl)
s Single disk failure
  s Disk 1: Fails (I pulled it)
  F Disk 0: FAILS TO GET INTEGRATED, DESPITE STILL BEING MARKED AS A
    HOT SPARE
[Now I'm in what's-the-best-of-a-bad-set-of-options mode]
s Replace the failed drive
  s Disk 1: Replaced
  s Observe that the RAID 5 array automatically gets rebuilt
  s Wait until the rebuild is complete
s Reboot and hope that the LSI card's init properly marks the hot spare
  as a hot spare
s Single disk failure
  s Disk 1: Fails
  F Disk 0: FAILS TO GET INTEGRATED, DESPITE STILL BEING MARKED AS A
    HOT SPARE
[So the reboot didn't magically make the hot spare work properly.]
s Replace the failed drive
  s Disk 1: Replaced, and the RAID 5 array automatically gets rebuilt
  s Wait until the rebuild is complete
s Reboot, enter the LSI boot menu
  s Configure > View/Add Configuration
  s Highlight disk 0 > F4 (hot spare)
  s "The Physical Drive is already a HOTSPARE\nPress any key to
    continue"
  s F10 (Configure), Esc, Esc
  s "Exit?" = YES
  s "Please Press Ctrl-Alt-Del to REBOOT the system", CTRL-ALT-DEL
s Single disk failure
  s Disk 1: Fails
  F Disk 0: FAILS TO GET INTEGRATED, DESPITE STILL BEING MARKED AS A
    HOT SPARE (as indicated by bioctl anyway)
[OK, this is really bad. I have no way of telling the RAID card to
recognize the hot spare, even from the LSI boot menu. The boot
menu is already showing it as a hot spare.]
s Reboot
s LSI boot menu
  s Configure > Easy Configuration > Esc > Esc
  s "Exit?" = YES
  s "Please Press Ctrl-Alt-Del to REBOOT the system", CTRL-ALT-DEL
[From this point, it looks as if I have no options. Even doing an easy
configuration didn't fix things. That drive is still reported as a
hot spare by both bioctl and the LSI boot menu, but it is *not* being
integrated into a degraded array. My only option is to clear the
configuration from the LSI boot menu, and rebuild my system from scratch.]
Finding a path that does work (avoid bioctl -H)
-----------------------------
s Configured from LSI boot menu
  s Clear configuration
  s New configuration
    s Disks 0, 1, 2: RAID 5 array
    s Disk 3: Hot spare
  s Initialize > Logical Drive 1
s Install OpenBSD-snapshot-2006.09.25
s Single disk failure
  s Disk 0: Fails
  s Disk 3: Automatically replaces it
  s Wait for the RAID 5 set to rebuild
s Replace disk 0 (now shown as "Unused" by bioctl)
s Reboot into LSI boot menu
  s Set disk 0 as Hot spare
s Boot into OpenBSD
  s bioctl reports disk 0:0.0 as "Hot spare"
s Fail disk 1
  s Watch the RAID 5 array automatically incorporate disk 0 and rebuild
s Once array has finished rebuilding
  s Replace disk 1
  s Reboot into LSI boot menu
  s Set Disk 1 as a hot spare (this cannot be done until the array has
    finished rebuilding)
  s Reboot into OpenBSD and continue watching it rebuild
[No problems as long as I only set my hot spares through the LSI boot
menu.]
System info (dmesg)
===========
OpenBSD 4.0-current (GENERIC) #1112: Mon Sep 25 03:49:49 MDT 2006
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Pentium(R) 4 CPU 2.40GHz ("GenuineIntel" 686-class) 2.40 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID
real mem = 1072197632 (1047068K)
avail mem = 970051584 (947316K)
using 4256 buffers containing 53710848 bytes (52452K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(9a) BIOS, date 10/20/05, BIOS32 rev. 0 @ 0xfb6d0, SMBIOS rev. 2.3 @ 0xf0800 (41 entries)
bios0: Supermicro P4SC8
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
apm0: flags 70102 dobusy 1 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xf0000/0xdf64
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfde80/224 (12 entries)
pcibios0: PCI Exclusive IRQs: 5 9 10 11 12
pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 6300ESB LPC" rev 0x00)
pcibios0: PCI bus #4 is the last bus
bios0: ROM list: 0xc0000/0x8000 0xc8000/0x2200
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "Intel 82875P Host" rev 0x02
ppb0 at pci0 dev 3 function 0 "Intel 82875P PCI-CSA" rev 0x02
pci1 at ppb0 bus 1
em0 at pci1 dev 1 function 0 "Intel PRO/1000CT (82547GI)" rev 0x00: irq 11, address 00:30:48:87:ad:e4
ppb1 at pci0 dev 28 function 0 "Intel 6300ESB PCIX" rev 0x02
pci2 at ppb1 bus 2
ppb2 at pci2 dev 2 function 0 "Intel IOP331 PCIX-PCIX" rev 0x07
pci3 at ppb2 bus 3
ami0 at pci3 dev 14 function 0 "Symbios Logic MegaRAID SATA 4x/8x" rev 0x07: irq 9
ami0: LSI 3008, 32b, FW 813G, BIOS vH425, 128MB RAM
ami0: 1 channels, 0 FC loops, 1 logical drives
scsibus0 at ami0: 40 targets
sd0 at scsibus0 targ 0 lun 0: <AMI, Host drive #00, > SCSI2 0/direct fixed
sd0: 284194MB, 284194 cyl, 64 head, 32 sec, 512 bytes/sec, 582029312 sec total
scsibus1 at ami0: 16 targets
uhci0 at pci0 dev 29 function 0 "Intel 6300ESB USB" rev 0x02: irq 10
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 29 function 1 "Intel 6300ESB USB" rev 0x02: irq 12
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
"Intel 6300ESB WDT" rev 0x02 at pci0 dev 29 function 4 not configured
"Intel 6300ESB APIC" rev 0x02 at pci0 dev 29 function 5 not configured
ehci0 at pci0 dev 29 function 7 "Intel 6300ESB USB" rev 0x02: irq 5
usb2 at ehci0: USB revision 2.0
uhub2 at usb2
uhub2: Intel EHCI root hub, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
ppb3 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0x0a
pci4 at ppb3 bus 4
vga1 at pci4 dev 9 function 0 "ATI Rage XL" rev 0x27
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
em1 at pci4 dev 10 function 0 "Intel PRO/1000MT (82541GI)" rev 0x00: irq 12, address 00:30:48:87:ad:e5
ichpcib0 at pci0 dev 31 function 0 "Intel 6300ESB LPC" rev 0x02
pciide0 at pci0 dev 31 function 1 "Intel 6300ESB IDE" rev 0x02: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
pciide0: channel 0 disabled (no drives)
pciide0: channel 1 disabled (no drives)
pciide1 at pci0 dev 31 function 2 "Intel 6300ESB SATA" rev 0x02: DMA, channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide1: using irq 11 for native-PCI interrupt
ichiic0 at pci0 dev 31 function 3 "Intel 6300ESB SMBus" rev 0x02: irq 9
iic0 at ichiic0
isa0 at ichpcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
lm0 at isa0 port 0x290/8: W83627HF
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask ff65 netmask ff65 ttymask ffe7
pctr: user-level cycle counter enabled
uhub2: device problem, disabling port 1
dkcsum: sd0 matches BIOS drive 0x80
root on sd0a
rootdev=0x400 rrootdev=0xd00 rawdev=0xd02
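As a sanity check on the sd0 line in the dmesg above, the reported size, geometry and sector count agree with each other if "MB" is read as 2^20 bytes (that reading is an assumption that happens to fit the numbers). A small Python sketch of the arithmetic, with helper names invented for illustration:

```python
SECTOR_BYTES = 512  # "512 bytes/sec" from the sd0 line


def geometry_sectors(cylinders, heads, sectors_per_track):
    """Total sectors implied by a cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track


def size_mb(total_sectors, sector_bytes=SECTOR_BYTES):
    """Size in MB, taking 1 MB = 2**20 bytes (an assumption)."""
    return total_sectors * sector_bytes // (1024 * 1024)


if __name__ == "__main__":
    total = geometry_sectors(284194, 64, 32)
    print(total, size_mb(total))  # prints: 582029312 284194
```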