Your problems occur only with RAID and the 2940U2W? 

I was told it should be a SCSI problem, not a RAID problem. So, I quit
trying to debug RAID issues and went on to SCSI possibilities. Maybe the
guru was wrong?

My latest attempt _may_ bear some fruit. I tried lowering the speed from
80MB/s to 40MB/s through the controller BIOS (thanks to C Polisher for the
suggestion). It's hard for me to catch the hangs, but so far this
morning, it _appears_ to have stopped (I've thought I found the problem
before). Of course, it's a pretty rotten solution to have to go to half
speed. Better not to have any RAID.

I have a similar setup to yours: ASUS P3B-F, single PII 400MHz, 512MB
RAM, with a 2940U2W (Adaptec BIOS v. 2.20.0) and a matched pair of U2W
Seagate Cheetah drives; Linux 2.2.14 patched with Mingo's raid-2.2.14-B1, as you
have. Using RAID-1 only. My system hangs momentarily (30 seconds to
several minutes) but then comes back on its own. No error messages.
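
Since the hangs are hard to catch and leave nothing in the logs, a small
heartbeat that timestamps its own disk writes might at least record when
and for how long each stall happened. This is only a rough sketch, not
something from the setup above: the function names, the one-second
interval, and the five-second threshold are all assumptions to adjust.

```python
import os
import time


def find_stalls(timestamps, threshold):
    """Return (start, gap) pairs where two consecutive timestamps
    are more than `threshold` seconds apart."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > threshold:
            stalls.append((prev, gap))
    return stalls


def heartbeat(path, interval=1.0, threshold=5.0):
    """Write and fsync a small file in a loop; print a line whenever
    one write-plus-sleep cycle takes longer than `threshold` seconds."""
    last = time.time()
    while True:
        with open(path, "w") as f:
            f.write(str(last))
            f.flush()
            os.fsync(f.fileno())  # push the write through to the disk
        time.sleep(interval)
        now = time.time()
        if find_stalls([last, now], threshold):
            print("stall: %.1f s ending %s" % (now - last, time.ctime(now)))
        last = now


# Hypothetical usage: point it at a file on the RAID-1 volume and send
# the output to the remote log host, e.g. heartbeat("/raid/heartbeat.tmp")
```

If the whole machine freezes, the loop freezes with it, but the gap still
shows up on the first cycle after it recovers.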

Just to let you know what I've tried:
 
 * set up remote logging to catch any error messages (none logged);

 * tried to provoke a hang with a heavy read/write load, no luck;

 * set SCSI logging on with 'echo "scsi log error 1" > /proc/scsi/scsi';

 * upgraded the aic7xxx from 5.1.21 to 5.1.28 (performance improved 
   but it still hangs);

 * physically removed IDE drives, turned off IDE support in motherboard,
   and removed kernel support for IDE, based on someone's hunch;

 * lowered the motherboard's front-side bus speed from 100.3MHz to 
   88.3MHz (underclocking the CPU), based on another hunch;

 * checked IRQs for conflicts; 

 * compiled the kernel with and without tagged command queuing, with 
   more and less max commands;

 * tried kernels 2.2.10 to 2.2.14 with appropriate RAID patches;

 * e-mailed the current aic7xxx maintainer, Doug Ledford 
   [EMAIL PROTECTED] for assistance, no response.
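
For the IRQ check in the list above, one way to avoid eyeballing it is to
scan /proc/interrupts for lines that name more than one device. The sketch
below is illustrative only: the function name is made up, and it assumes
device names on a shared line are comma-separated and contain no spaces.
A shared PCI IRQ is not necessarily a conflict, so treat any hit as a lead
rather than a diagnosis.

```python
def shared_irqs(text):
    """Map IRQ number -> list of device names, for IRQ lines in
    /proc/interrupts-style text that list more than one device."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        # Skip the CPU header and non-numeric rows such as NMI or ERR.
        if not sep or not label.strip().isdigit():
            continue
        parts = rest.split(",")
        if len(parts) < 2:
            continue  # only one device on this IRQ
        first_tokens = parts[0].split()
        if not first_tokens:
            continue
        # The last token before the first comma is the first device name;
        # everything after each comma is another device.
        devices = [first_tokens[-1]] + [p.strip() for p in parts[1:]]
        shared[int(label.strip())] = devices
    return shared


# Hypothetical usage:
# with open("/proc/interrupts") as f:
#     print(shared_irqs(f.read()))
```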

On the todo debugging list:

 * obtain another SCSI cable so I can add a discrete terminator 
   (currently using the built-in terminator that came with the card);

 * purchase a case thermometer to verify air flow is okay.

Of course, if you are only having your problems with RAID, I guess I
should make the effort to remove RAID, and see if it solves my problems.

Regards,

Jeff Hill

[EMAIL PROTECTED] wrote:
> 
> Hi All!
> 
> Lately, we have been experiencing some serious problems with our Linux servers
> using RAID0 on Adaptec 2940U2W. The machines, which are under quite some load,
> suddenly die and must be cold-restarted. When they get back online again,
> there is no sign of anything going awry in any logfile. They just plunge into
> deep-freeze, zero-Kelvin mode. *argh*
> 
> Currently, the machines are running Linux 2.2.14 with latest raid-patches
> (Mingo's raid-2.2.14-B1-patch), but we've seen the problem under 2.2.13 as
> well.
> 
> As I said, there's nothing in the log files that would indicate what's wrong.
> Installing the software watchdog kernel module/watchdogd didn't help either.
> 
> The situation is getting somewhat embarrassing, as we've been pushing pretty
> hard towards Linux. We're considering moving all servers to non-RAID
> configurations, but we'd really prefer RAID0.
> 
> I've also noticed a few other postings about problems/hangups with 2940/AIC79xx
> on Linux RAID, so it seems we're not alone with this problem.
> 
> Does anyone have any kind of information as to the status of this? Is the
> bug(s) identified? Is there a solution (other than stop using RAID)?
> 
> Hardware setup: RH Linux 6.1/2.2.14/raid-2.2.14-B1 on dual PIII motherboards
> (ASUS P2B-DS) and U2W SCSI IBM disks, 512+ MB RAM.
> 
> /m

-- 
------------------------------------------------------------
------  HR On-Line:  The Network for Workplace Issues ------
http://www.hronline.com - Ph:416-604-7251 - Fax:416-604-4708
------------------------------------------------------------
