>is it correct that this failing kernel didn't have the RAID patch applied?
No, it was with RAID too (all pre15 runs were). But this specific test result is not
of much use anyway, since there was a possibly problematic reject with the Unified
IDE patch (but still: that kernel survived longest).
>(B) is quite unlikely if you do not have it applied and the box still
>crashes? My understanding is that others who had IDE+SMP problems could
>reproduce it without RAID as well. (RAID0 stresses the hardware harder)
The only worthwhile things in the equation besides 2.2.13pre15 seem to be the
raid-patch and the hardware.
But the raid software runs on SCSI without known problems, it didn't have a bad SMP
track record lately like the IDE driver (if ever), and its design isn't affected by
overly complicated and quirky hardware (like the IDE driver).
I agree with you that the raid patch is a really bad candidate for the lockup.
Unfortunately I could only do a few tests with pre15, and only one failure looked like
an IDE lockup. Not a good basis for any pre15 IDE instability claims anyway. Could
still be a hardware issue or, well, cosmic rays. Or a CPU heat problem, as Mike Black
suggested. Or anything else in 2.2.13pre15 (mm, ext2, whatever).
Probably only systematic testing could resolve that. Unfortunately I could not do
this, since I was running out of time. The IDE raid array is now in a single-processor
machine. This one ran like clockwork for 24 hours under heavy stress, on all five
drives, on five controllers, udma33 enabled. With 2.2.13pre15 (kernel as outlined in
the previous posting). The SMP machine works flawlessly too under stress for 24h, with
one IDE drive (same kernel).
Judging from the tests it seems to me that 2.2.12 to 2.2.13pre14 had easily
reproducible IDE+SMP stability problems, and 2.2.13pre15 possibly still has sporadic
ones. But generally only with several IDE controllers used at the same time.
==============
On Wed, 6 Oct 1999 [EMAIL PROTECTED] wrote:
> One more pre15 test:
> 2.2.13pre15 with Unified IDE 2.2.13pre14-19991003 (two rejects in ide.c, one ok, one
>probably harmless):
> (5) dual P3 machine: NULL deref after 6 hours (i.e. this pre15 kernel survived
>longest)
is it correct that this failing kernel didn't have the RAID patch applied?
> I can think of these possible reasons for the SMP problems:
>
> (A) SMP race(s) in IDE driver in original 2.2.13pre15
> (B) SMP-deadlock in raid-2.2.11-patch
(B) is quite unlikely if you do not have it applied and the box still
crashes? My understanding is that others who had IDE+SMP problems could
reproduce it without RAID as well. (RAID0 stresses the hardware harder)
-- mingo
---Header---
Received: from chiara.csoma.elte.hu (chiara.csoma.elte.hu [157.181.71.18])
by rivalnet.de (8.9.3/8.9.3) with ESMTP id RAA03555
for <[EMAIL PROTECTED]>; Wed, 6 Oct 1999 17:31:34 +0200
Received: (from mingo@localhost)
by chiara.csoma.elte.hu (8.8.8/8.8.8/c) id RAA03472;
Wed, 6 Oct 1999 17:31:14 +0200
Date: Wed, 6 Oct 1999 17:37:20 +0200 (CEST)
From: Ingo Molnar <[EMAIL PROTECTED]>
Sender: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: 2.2.13pre15 SMP+IDE test summary
In-Reply-To: <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>