Your problems occur only with RAID and the 2940U2W?
I was told it should be a SCSI problem, not a RAID problem. So, I quit
trying to debug RAID issues and went on to SCSI possibilities. Maybe the
guru was wrong?
My latest attempt _may_ bear some fruit. I tried lowering the speed from
80MB/s to 40MB/s through the controller BIOS (thanks to C Polisher for the
suggestion). It's hard for me to catch the hangs, but so far this
morning, it _appears_ to have stopped (I've thought I found the problem
before). Of course, it's a pretty rotten solution to have to go to half
speed. Better not to have any RAID.
I have a similar setup to yours: ASUS P3B-F, single PII 400MHz, 512MB RAM,
with 2940U2W (Adaptec BIOS v. 2.20.0) and a matched pair of U2W Seagate
Cheetah drives; Linux 2.2.14 patched with Mingo's raid-2.2.14-B1, as you
have. Using RAID-1 only. My system hangs momentarily (30 seconds to
several minutes) but then comes back on its own. No error messages.
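Since the hangs are hard to catch by eye, a small probe along these lines could timestamp them. This is only a sketch, not something I have run on the setup above: it writes a 4KB block to a file on the array once a second, forces it to disk with fsync, and reports any write that takes longer than a threshold. The function names, the probe path, and the 5-second threshold are all assumptions chosen for illustration. Because the machine comes back on its own, a stalled write should eventually return and be reported.

```python
import os
import time


def timed_sync_write(path, payload=b"x" * 4096):
    """Write a small block to `path`, force it to disk, return elapsed seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # block until the kernel reports the data is on disk
    finally:
        os.close(fd)
    return time.monotonic() - start


def find_stalls(samples, threshold):
    """Return the (wall_clock_time, latency) samples slower than `threshold`."""
    return [(when, latency) for when, latency in samples if latency > threshold]


def watch(path, interval=1.0, threshold=5.0, rounds=60):
    """Probe `path` `rounds` times and return the samples that stalled."""
    samples = []
    for _ in range(rounds):
        when = time.time()
        latency = timed_sync_write(path)
        samples.append((when, latency))
        if latency > threshold:
            print(f"{time.ctime(when)}: write stalled for {latency:.1f}s", flush=True)
        time.sleep(interval)
    return find_stalls(samples, threshold)


if __name__ == "__main__":
    # Hypothetical probe file; point it at the filesystem on the RAID device.
    watch("/tmp/stall_probe.dat")
```

Redirecting the output to a file on a different disk, or to the remote log host, would keep the record even if the array itself is what stalls.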
Just to let you know what I've tried:
* set up remote logging to catch any error messages (none logged);
* tried to trigger a hang with a heavy read/write load, no luck;
* set SCSI logging on with 'echo "scsi log error 1" > /proc/scsi/scsi';
* upgraded the aic7xxx from 5.1.21 to 5.1.28 (performance improved
but it still hangs);
* physically removed IDE drives, turned off IDE support in motherboard,
and removed kernel support for IDE, based on someone's hunch;
* lowered the motherboard's front-side bus speed from 100.3MHz to 88.3MHz
(underclocking the CPU), based on another hunch;
* checked IRQs for conflicts;
* compiled the kernel with and without tagged command queuing, with
more and less max commands;
* tried kernels 2.2.10 to 2.2.14 with appropriate RAID patches;
* e-mailed the current aic7xxx maintainer, Doug Ledford
[EMAIL PROTECTED] for assistance, no response.
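For the IRQ check in the list above, one way to avoid reading /proc/interrupts by eye is a short parser like the following. It is a sketch under an assumption: it expects the 2.2-style layout, where each row is the IRQ number, the per-CPU counts, one controller-type column (such as XT-PIC), and then a comma-separated list of device names. Newer kernels add columns, so the parsing would need adjusting there. The function name `shared_irqs` is made up for this example. A shared line is not necessarily a conflict, since PCI devices are allowed to share interrupts, but it shows where to look first.

```python
def shared_irqs(text):
    """Map IRQ number -> device names for rows that list more than one device."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # header line, or non-numeric rows such as NMI/ERR
        fields = rest.split()
        # Drop the per-CPU interrupt counts that lead the row.
        while fields and fields[0].isdigit():
            fields.pop(0)
        # Assumed layout: one controller-type column, then the device names.
        names = [n.strip() for n in " ".join(fields[1:]).split(",") if n.strip()]
        if len(names) > 1:
            shared[int(label)] = names
    return shared


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        for irq, names in sorted(shared_irqs(f.read()).items()):
            print(irq, ", ".join(names))
```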
On the debugging to-do list:
* obtain another SCSI cable to add discrete terminator (currently
have built-in terminator that came with the card);
* purchase a case thermometer to verify air flow is okay.
Of course, if you are only having your problems with RAID, I guess I
should make the effort to remove RAID, and see if it solves my problems.
Regards,
Jeff Hill
[EMAIL PROTECTED] wrote:
>
> Hi All!
>
> Lately, we have been experiencing some serious problems with our Linux servers
> using RAID0 on Adaptec 2940U2W. The machines, which are under quite some load,
> suddenly die and must be cold-restarted. When they get back online again,
> there is no sign of anything going awry in any logfile. They just plunge into
> deep-freeze, zero-Kelvin mode. *argh*
>
> Currently, the machines are running Linux 2.2.14 with latest raid-patches
> (Mingo's raid-2.2.14-B1-patch), but we've seen the problem under 2.2.13 as
> well.
>
> As I said, there's nothing in the log files that would indicate what's wrong.
> Installing the software watchdog kernel module/watchdogd didn't help either.
>
> The situation is getting somewhat embarrassing, as we've been pushing pretty
> hard towards Linux. We're considering moving all servers to non-RAID
> configurations, but we'd really prefer RAID0.
>
> I've also noticed a few other postings about problems/hangups with 2940/AIC7xxx
> on Linux RAID, so it seems we're not alone with this problem.
>
> Does anyone have any information on the status of this? Have the
> bug(s) been identified? Is there a solution (other than to stop using RAID)?
>
> Hardware setup: RH Linux 6.1/2.2.14/raid-2.2.14-B1 on dual PIII motherboards
> (ASUS P2B-DS) and U2W SCSI IBM disks, 512+ MB RAM.
>
> /m
--
------------------------------------------------------------
------ HR On-Line: The Network for Workplace Issues ------
http://www.hronline.com - Ph:416-604-7251 - Fax:416-604-4708
------------------------------------------------------------