Howdy, Linux SCSI folks. I'm working with a friend who recently moved
his Linux box from Red Hat to Debian, and in the process upgraded the
kernel to 2.4.2.

He's running software RAID5 on disks attached to his QLogic ISP2100
Fibre Channel card, which is hooked up to a 6-disk Fibre Channel shelf.
When he did a full backup and put some stress on the drives, the
following appeared in the logs (date and machine name stripped for
readability), followed by a spontaneous *reboot*. (yikes!)
10:26:22 md: syncing RAID array md0
10:26:22 md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc.
10:26:22 md: using maximum available idle IO bandwith (but not more than 100000 KB/sec) for reconstruction.
10:26:22 md: using 124k window, over a total of 8883840 blocks.
10:26:22 sdf1 [events: 00000006](write) sdf1's sb offset: 8883840
10:26:22 sde1 [events: 00000006](write) sde1's sb offset: 8883840
10:26:22 sdd1 [events: 00000006](write) sdd1's sb offset: 8883840
10:26:22 sdc1 [events: 00000006](write) sdc1's sb offset: 8883840
10:26:22 sdb1 [events: 00000006](write) sdb1's sb offset: 8883840
10:26:22 .
10:26:22 ... autorun DONE.
10:26:27 raid5: switching cache buffer size, 4096 --> 1024
10:26:28 qlogicfc0 : no handle slots, this should not happen.
10:26:28 hostdata->queued is 2c, in_ptr: 26
10:26:28 qlogicfc0 : no handle slots, this should not happen.
10:26:28 hostdata->queued is 2c, in_ptr: 76
10:26:28 qlogicfc0 : no handle slots, this should not happen.
[snip about 100 lines of different in_ptr no handle slots reports,
with in_ptr ranging all over the place from 3 to f to 54 to b to 69,
etc. etc.]
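
For anyone who wants to tally these reports from a saved log rather than
eyeball them, here is a rough Python sketch. It assumes the two values
in the "hostdata->queued" lines are printed in hexadecimal (values like
2c and f suggest so, but I haven't checked the driver source), and the
names summarize and QUEUE_RE are made up for this example. The sample
lines are the ones quoted above.

```python
import re
from collections import Counter

# Matches lines such as "hostdata->queued is 2c, in_ptr: 26".
# Assumption: both fields are hexadecimal.
QUEUE_RE = re.compile(r"hostdata->queued is ([0-9a-f]+), in_ptr: ([0-9a-f]+)")
ERROR_TEXT = "no handle slots"


def summarize(lines):
    """Count 'no handle slots' reports and collect queued/in_ptr values."""
    errors = 0
    queued = Counter()   # how often each queued depth was reported
    in_ptrs = []         # in_ptr values in the order they appeared
    for line in lines:
        if ERROR_TEXT in line:
            errors += 1
        match = QUEUE_RE.search(line)
        if match:
            queued[int(match.group(1), 16)] += 1
            in_ptrs.append(int(match.group(2), 16))
    return errors, queued, in_ptrs


sample = [
    "10:26:28 qlogicfc0 : no handle slots, this should not happen.",
    "10:26:28 hostdata->queued is 2c, in_ptr: 26",
    "10:26:28 qlogicfc0 : no handle slots, this should not happen.",
    "10:26:28 hostdata->queued is 2c, in_ptr: 76",
]

errors, queued, in_ptrs = summarize(sample)
print(errors, dict(queued), in_ptrs)
```

Feeding it the whole log (for example, the lines of an open file) would
show whether the queued count stays pinned at one value while in_ptr
wanders, which is what the excerpt above looks like.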
What's bizarre is that in the middle of these errors, the *kernel
logs* started getting filled with the filenames that were being
backed up:
Mar 13 17:05:46 malchus kernel: hostdata->queued is 2b, in_ptr: a
Mar 13 17:05:46 malchus kernel: qlogicfc0 : no handle slots, this should not happen/xfm_tiff.xpm
./oldroot/usr/X11R6/lib/X11/xfm/pixmaps/xfm_updir.xpm
./oldroot/usr/X11R6/lib/X11/xfm/pixmaps/xfm_uu.xpm
./oldroot/usr/X11R6/lib/X11/xfm/pixmaps/xfm_word.xpm
./oldroot/usr/X11R6/lib/X11/xfm/pixmaps/xfm_xbm.xpm
[snip]
until, I guess, the reboot itself:
./oldroot/usr/X11R6/lib/X11/config/Threads.tmpl
./oldroot/usr/X11R6/lib/X11/config/Win32.cf
./oldroMar 13 17:18:38 malchus kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 13 17:18:38 malchus kernel: Inspecting /boot/System.map-2.4.2
Yes, it happens in the middle of the line like that.
I downloaded the latest drivers (isp2x00-v1.3.tgz) from
http://www.iol.unh.edu/consortiums/fc/linux/qlogic.html, copied the
three files qlogicfc.c, qlogicfc.h, and qlogicfc_asm.c over the
existing files in drivers/scsi/, and recompiled successfully. Even more
bizarrely, the ISP2100 card wasn't detected at all after rebooting with
the new kernel!
*Nothing* else was changed in the config; we just recompiled with the
new drivers, and the card wasn't mentioned at all in the SCSI
initialization.
Any suggestions? Is the qlogic driver page above maybe not up to date
with the 2.4.2 kernel? We need some stability here, and while we can
move back to a 2.2.* kernel, it'd be really nice to have iptables.
I can do some more debugging if it's needed, but this box has lots of
users, and we'd like to keep reboots and debugging time to a minimum.
For what it's worth, the ISP2100 is hooked up to 6 Seagate ST19171FC
drives.
Any help would be appreciated!
Ben Gertzfield
--
Brought to you by the letters Y and M and the number 16.
"I wanna be Twist Barbie!"
Debian GNU/Linux maintainer of Gimp and GTK+ -- http://www.debian.org/