Re: 2.4.x very unstable on 8-way IBM 8500R
On Fri, 2 Mar 2001, ext Alan Cox wrote:

> > > (from Red Hat 7) but very erratic on all 2.4-kernels I've tried it with
> > > (2.4.[012], compiled both with egcs and RH7's gcc-2.96, both share the
> >
> > Under redhat 7 you should use kgcc to compile the kernel, since gcc2.96 is
>
> So he was using egcs, and whether he had the pre-errata gcc 2.96
> wouldn't matter

Since this (once again) came up... I've been running 2.4.[012] on my home
box compiled with 2.96-errata without a single problem so far. And yes I
know it's not supported, consider this just a datapoint :)

	- Panu -

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.4.x very unstable on 8-way IBM 8500R
On Thu, 1 Mar 2001, Dr. Kelsey Hudson wrote:

>> I've been playing around with 8-way IBM8500R (8x700MHz Xeon) with 4.5GB
>> memory & AIC7xxx SCSI-controller. It's perfectly stable with 2.2-kernel
>> (from Red Hat 7) but very erratic on all 2.4-kernels I've tried it with
>> (2.4.[012], compiled both with egcs and RH7's gcc-2.96, both share the
>
>Under redhat 7 you should use kgcc to compile the kernel, since gcc2.96 is
>inherently broken(*).

http://www.bero.org/gcc296.html

>> same symptoms). It did have a ServeRAID controller too but IBM suggested
>> we take it out since 4500R also had problems with it on 2.4 but it didn't
>> make any difference at all. Also tried to turn off highmem support but
>> didn't make difference either.
>
>(*) redhat chose to ship an experimental compiler with this release of
>    the distribution that has a great many bugs. to ensure proper kernel
>    compilation another proven version of gcc was included, but called
>    kgcc instead. You should always use this to compile your kernels
>    under redhat 7 until the newer version of gcc is released.

http://www.bero.org/gcc296.html

--
Mike A. Harris  -  Linux advocate  -  Free Software advocate
This message is copyright 2001, all rights reserved.
Views expressed are my own, not necessarily shared by my employer.
--
Red Hat Linux:  http://www.redhat.com
Download for free:  ftp://ftp.redhat.com/pub/redhat/redhat-6.2/
Re: 2.4.x very unstable on 8-way IBM 8500R
On Thu, Mar 01, 2001 at 05:04:09PM -0800, Dr. Kelsey Hudson wrote:
> On Thu, 1 Mar 2001, Matilainen Panu (NRC/Helsinki) wrote:
>
> > I've been playing around with 8-way IBM8500R (8x700MHz Xeon) with 4.5GB
> > memory & AIC7xxx SCSI-controller. It's perfectly stable with 2.2-kernel
> > (from Red Hat 7) but very erratic on all 2.4-kernels I've tried it with
> > (2.4.[012], compiled both with egcs and RH7's gcc-2.96, both share the
>
> Under redhat 7 you should use kgcc to compile the kernel, since gcc2.96 is
> inherently broken(*).
>

For the umpteenth time, no it isn't. There are serious bugs in the shipped
version of gcc in RedHat 7.0, but they are fixed by applying the update.
The reason for supplying kgcc is to allow building a 2.2 kernel, because of
bugs in the kernel, NOT the compiler.

> > same symptoms). It did have a ServeRAID controller too but IBM suggested
> > we take it out since 4500R also had problems with it on 2.4 but it didn't
> > make any difference at all. Also tried to turn off highmem support but
> > didn't make difference either.
>
> (*) redhat chose to ship an experimental compiler with this release of
>     the distribution that has a great many bugs. to ensure proper kernel
>     compilation another proven version of gcc was included, but called
>     kgcc instead. You should always use this to compile your kernels
>     under redhat 7 until the newer version of gcc is released.
>

No. Provided you grab the update, you can build the 2.4 kernel perfectly
happily using the RedHat gcc snapshot. I'm running it successfully on a
number of machines.

The issue with 2.4 on certain Netfinities is a bad interaction between the
NMI watchdog code and the systems management card. Changing compilers makes
no difference.

Tim

--
Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
IBM Linux Technology Center, Beaverton, Oregon
Interested in Linux scalability ? Look at http://lse.sourceforge.net/
"Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI
Re: 2.4.x very unstable on 8-way IBM 8500R
Just FYI, I am chasing this problem. There appears to be an unpleasant
interaction between the Advanced Systems Management card and the NMI watchdog
code. Ripping the card out of the machine also eradicates the problem, but is
less desirable. I'll let people know when there's a better solution.

Tim

On Thu, Mar 01, 2001 at 03:30:56PM +0200, Matilainen Panu (NRC/Helsinki) wrote:
> On Thu, 1 Mar 2001, ext Andrew Morton wrote:
> > "Matilainen Panu (NRC/Helsinki)" wrote:
> > > On Thu, 1 Mar 2001, ext Andrew Morton wrote:
> > > >
> > > > Is it stable with `nmi_watchdog=0'?
> > >
> > > If the default value for nmi_watchdog is 0 then no - I added the
> > > nmi_watchdog=1 just to see if that makes any difference. If it's on by
> > > default then I'll need to test it that way.
> >
> > Default for nmi_watchdog is `enabled'.
> >
> > Several people have reported that turning it off with
> > the `nmi_watchdog=0' LILO option makes systems stable.
> > Nobody knows why.
> >
> > (If nmi_watchdog _does_ make the machine stable, please
> > tell linux-kernel.).
>
> It's too early to say for sure but that seems to have fixed it. Uptime now
> nearly an hour under loads of 20-30 which is way more than it has been
> able to stay up before. I'll let you know whether it's still up tomorrow.
>
> Million thanks for the tip!
>
> 	- Panu -
>

--
Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
IBM Linux Technology Center, Beaverton, Oregon
Interested in Linux scalability ? Look at http://lse.sourceforge.net/
"Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI
Re: 2.4.x very unstable on 8-way IBM 8500R
"Dr. Kelsey Hudson" wrote:

> Under redhat 7 you should use kgcc to compile the kernel, since gcc2.96 is
> inherently broken(*).

Or upgrade to the current Red Hat 7 gcc,
which works quite well.

jjs
Re: 2.4.x very unstable on 8-way IBM 8500R
> > (from Red Hat 7) but very erratic on all 2.4-kernels I've tried it with
> > (2.4.[012], compiled both with egcs and RH7's gcc-2.96, both share the
> Under redhat 7 you should use kgcc to compile the kernel, since gcc2.96 is

So he was using egcs, and whether he had the pre-errata gcc 2.96
wouldn't matter
Re: 2.4.x very unstable on 8-way IBM 8500R
On Thu, 1 Mar 2001, Matilainen Panu (NRC/Helsinki) wrote:

> I've been playing around with 8-way IBM8500R (8x700MHz Xeon) with 4.5GB
> memory & AIC7xxx SCSI-controller. It's perfectly stable with 2.2-kernel
> (from Red Hat 7) but very erratic on all 2.4-kernels I've tried it with
> (2.4.[012], compiled both with egcs and RH7's gcc-2.96, both share the

Under redhat 7 you should use kgcc to compile the kernel, since gcc2.96 is
inherently broken(*).

> same symptoms). It did have a ServeRAID controller too but IBM suggested
> we take it out since 4500R also had problems with it on 2.4 but it didn't
> make any difference at all. Also tried to turn off highmem support but
> didn't make difference either.

(*) redhat chose to ship an experimental compiler with this release of
    the distribution that has a great many bugs. to ensure proper kernel
    compilation another proven version of gcc was included, but called
    kgcc instead. You should always use this to compile your kernels
    under redhat 7 until the newer version of gcc is released.

talk to you later,

 Kelsey Hudson           [EMAIL PROTECTED]
 Software Engineer
 Compendium Technologies, Inc               (619) 725-0771
---
Re: 2.4.x very unstable on 8-way IBM 8500R
On Thu, 1 Mar 2001, ext Andrew Morton wrote:

> "Matilainen Panu (NRC/Helsinki)" wrote:
> > On Thu, 1 Mar 2001, ext Andrew Morton wrote:
> > >
> > > Is it stable with `nmi_watchdog=0'?
> >
> > If the default value for nmi_watchdog is 0 then no - I added the
> > nmi_watchdog=1 just to see if that makes any difference. If it's on by
> > default then I'll need to test it that way.
>
> Default for nmi_watchdog is `enabled'.
>
> Several people have reported that turning it off with
> the `nmi_watchdog=0' LILO option makes systems stable.
> Nobody knows why.
>
> (If nmi_watchdog _does_ make the machine stable, please
> tell linux-kernel.).

It's too early to say for sure but that seems to have fixed it. Uptime now
nearly an hour under loads of 20-30 which is way more than it has been
able to stay up before. I'll let you know whether it's still up tomorrow.

Million thanks for the tip!

	- Panu -
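
[Archive note: the message above refers to the `nmi_watchdog=0' LILO option
without showing how it is passed. A minimal sketch follows, assuming a stock
LILO setup. Only `nmi_watchdog=0' comes from the thread; the image path,
label and root device are hypothetical placeholders and need adapting.]

```
# /etc/lilo.conf (fragment)
# image, label and root below are hypothetical; only the append line
# reflects the option discussed in this thread.
image=/boot/vmlinuz-2.4.2
    label=linux-2.4.2
    root=/dev/sda1
    read-only
    append="nmi_watchdog=0"    # disable the NMI watchdog at boot
```

[After editing lilo.conf, /sbin/lilo has to be re-run for the change to take
effect. For a one-off test the same parameter can instead be typed after the
image label at the LILO boot prompt, e.g. `linux-2.4.2 nmi_watchdog=0'.]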
2.4.x very unstable on 8-way IBM 8500R
Hi,

I've been playing around with 8-way IBM8500R (8x700MHz Xeon) with 4.5GB
memory & AIC7xxx SCSI-controller. It's perfectly stable with 2.2-kernel
(from Red Hat 7) but very erratic on all 2.4-kernels I've tried it with
(2.4.[012], compiled both with egcs and RH7's gcc-2.96, both share the
same symptoms). It did have a ServeRAID controller too but IBM suggested
we take it out since 4500R also had problems with it on 2.4 but it didn't
make any difference at all. Also tried to turn off highmem support but
didn't make difference either.

Symptoms: it sometimes boots and stays up for a while (anything between 10
seconds to maximum of about half an hour) but most of the time it locks up
early in the boot while it's enabling the CPUs:

--
Booting processor 4/3 eip 2000
Setting warm reset code and vector.
1.
2.
3.
Asserting INIT.
Waiting for send to finish...
+Deasserting INIT.
Waiting for send to finish...
+#startup loops: 2.
Sending STARTUP #1.
After apic_write.
Startup point 1.
Waiting for send to finish...
+Sending STARTUP #2.
After apic_write.
Startup point 1.
Waiting for send to finish...
+After Startup.
Before Callout 4.
After Callout 4.
--

..this is where it *usually* locks up, but the processor number where it
hangs varies randomly. Also it has locked up in other places too a couple of
times. If it boots and crashes then there's nothing in the logs, it's just a
sudden hard lockup. If it is booted with "nosmp noapic" it seems perfectly
stable but I'd sure like to use those other 7 CPUs too :)

Any ideas/suggestions/patches etc would be greatly appreciated...

	- Panu -

Here's a bootlog of a rare successful boot (hopefully got the copy-paste
right...)

---
Inspecting /boot/System.map-2.4.2
Loaded 14798 symbols from /boot/System.map-2.4.2.
Symbols match kernel version 2.4.2.
No module symbols loaded.
ESR value after enabling vector:
Calibrating delay loop... 1399.19 BogoMIPS
Stack at about c5cd7fb8
CPU: Before vendor init, caps: 0383fbff , vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 1024K
Intel machine check reporting enabled on CPU#2.
CPU: After vendor init, caps: 0383fbff
CPU: After generic, caps: 0383fbff
CPU: Common caps: 0383fbff
OK.
CPU2: Intel Pentium III (Cascades) stepping 01
CPU has booted.
Booting processor 3/2 eip 2000
Setting warm reset code and vector.
1.
2.
3.
Asserting INIT.
Waiting for send to finish...
+Deasserting INIT.
Waiting for send to finish...
+#startup loops: 2.
Sending STARTUP #1.
After apic_write.
Startup point 1.
Waiting for send to finish...
+Sending STARTUP #2.
After apic_write.
Startup point 1.
Waiting for send to finish...
+After Startup.
Before Callout 3.
After Callout 3.
Initializing CPU#3
CPU#3 (phys ID: 2) waiting for CALLOUT
CALLIN, before setup_local_APIC().
masked ExtINT on CPU#3
ESR value before enabling vector:
ESR value after enabling vector:
Calibrating delay loop... 1399.19 BogoMIPS
Stack at about c5cd5fb8
CPU: Before vendor init, caps: 0383fbff , vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 1024K
Intel machine check reporting enabled on CPU#3.
CPU: After vendor init, caps: 0383fbff
CPU: After generic, caps: 0383fbff
CPU: Common caps: 0383fbff
OK.
CPU3: Intel Pentium III (Cascades) stepping 01
CPU has booted.
Booting processor 4/3 eip 2000
Setting warm reset code and vector.
1.
2.
3.
Asserting INIT.
Waiting for send to finish...
+Deasserting INIT.
Waiting for send to finish...
+#startup loops: 2.
Sending STARTUP #1.
After apic_write.
Startup point 1.
Waiting for send to finish...
+Sending STARTUP #2.
After apic_write.
Startup point 1.
Waiting for send to finish...
+After Startup.
Before Callout 4.
After Callout 4.
Initializing CPU#4
CPU#4 (phys ID: 3) waiting for CALLOUT
CALLIN, before setup_local_APIC().
masked ExtINT on CPU#4
ESR value before enabling vector:
ESR value after enabling vector:
Calibrating delay loop... 1399.19 BogoMIPS
Stack at about c5cd3fb8
CPU: Before vendor init, caps: 0383fbff , vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 1024K
Intel machine check reporting enabled on CPU#4.
CPU: After vendor init, caps: 0383fbff
CPU: After generic, caps: 0383fbff
CPU: Common caps: 0383fbff
OK.
CPU4: Intel Pentium III (Cascades) stepping 01
CPU has booted.
Booting processor 5/4 eip 2000
Setting warm reset code and vector.
1.
2.
3.
Asserting INIT.
Waiting for send to finish...
+Deasserting INIT.
Waiting for send to finish...
+#startup loops: 2.
Sending STARTUP #1.
After apic_write.
Startup point 1.
Waiting for send to finish...
+Sending STARTUP #2.
After apic_write.
Startup point 1.
Waiting for send to finish...
+After Startup.
Before Callout 5.
Af