On Wednesday September 25 2002 08:25 am, Marcia wrote:
> Dear Tom
>
> Tom Brinkman wrote:
> >>Since the 'nopentium' bandaid didn't fix it, let's start again
> >>Marcia. List the hardware involved, particularly mobo, psu, video,
> >> and what Mandrake version, which video drivers are used. Ram
> >> vendor, if you know? IIRC, it's Mdk 8.2, with an ECS mobo. Got
> >> the model/ revision/bios vendor and numbers?
>
> The link for my board is http://www.ecsusa.com/ and my motherboard is
> the L7VMM.

AMD apprv'd for your 1600+. Unfortunately, I have no experience with
these new micro-boards (an' I'm not an ECS fan). The latest bios is 1.0a
http://www.ecsusa.com/ecsusa/www.ecs.com.tw/download/l7vmm.htm

"1. Remove "CPU warning temp item" in BIOS setup
    The ITE8705 chipset use the same high and low limit for
    "CPU warning temp & CPU shutdown temp"
 2. To fix Hynix 128M X 2 or Samsung 128M X 2 system will
    auto-restart when running" ....

Either fix could be pertinent to your crash problem, so update if you
don't already have 1.0a. Both are worrisome in that they deal with auto
shutdowns (crashes), one for temp, the other for ram.

> I disabled the onboard lan because even though it worked
> it was grabbing the same irq as sound. The company sent me a new lan
> card which helped that it seems. This is an Athlon 1600+ XP with
> 512MB PC2100 DDR, 266 MHZ SDRAM,

Yes, but who makes the ram? Two important points: the actual ram chips,
and the pcb (board) implementation of the chips. IOW, Micron chips
(good) on a generic pcb (bad) ... well two wrongs don't make a right ;>

Look in bios and see what the ram timings are. The most lenient are
CAS 3-3-3, and if there's a setting for 'bank interleaving', disable
it. At least till we get your crash problem solved, go for lenient.
2-2-2 and 4-bank are the optimum, but only good ram on a good mobo with
a good PSU can do it. Also it's 133 MHz x2 ram. (the x2, and DDR are
mostly marketing talk)

Probly now's a good time to run the machine overnite booting to
memtest86. Look on your CD's, or use SoftwareManager, you should find
somethin like memtest86-3.0-2mdk. Install that rpm, it'll add a
memtest86 boot option to lilo (or grub). When you re-boot, choose this
option and let the tests run overnite.

Plan B, if your machine doesn't like booting this option, then look in
/boot. After installin the memtest rpm you'll see a file like
memtest-3.0.bin.
So put in a good floppy and type
'dd if=/boot/memtest-3.0.bin of=/dev/fd0' (caution, your memtest
version is probly different than mine). That'll make a memtest86 floppy
you can boot from. Just choose 'floppy' from lilo.

If you can't run memtest86 overnite with -0- errors, then we probly
have found the problem ... the ram, or how well your motherboard gets
along with it, or both. Could still be PSU tho.

> I had the cooler master added plus
> an extra case fan. This is a brand new machine. I have Win95 as a
> dual boot and Win does not have the problems that my Linux side has.

Win9.x --> WinXP tolerates sloppy (win)hardware, actually encourages it
IMO. Most all CoolerMaster hs/fans are AMD appr'vd, so we probly don't
need to look there. I'd advise you tho, that it's probly usin a thermal
pad to contact the cpu's die, and this will deteriorate over time,
might even fail. Thermal grease is much better, now and later.

> cat /proc/interrupts |
> 11: 154 XT-PIC usb-uhci, usb-uhci

What USB devices do you have? Appears two are sharing IRQ11 or it's
possibly a double entry. Everything else looked good.

>
> There is a temperature and performance utility in the bios. What are
> lm_sensors/gkrellm? I would gladly install this if needed.

Most common causes of random, occasional lockups and reboots are faulty
ram, or overheating. Even a lot of Windoze problems get blamed on M$,
when these two culprits are really at fault (specially Winsux Registry
errors).

The temp you see in bios is really only good for verifying that you
have hardware support for temp, voltage, fan monitoring. When you see
this temp the system is not under load, and usually is comin from a
cool state. Specially if it's been off for more'n just a few seconds.
Processor core temp is _very_ dynamic. Also there's only a very few
current mobos that can really access AthlonXP internal diode core temps
(Asus, Gigabyte). All other boards, including yours an' mine, measure
the temp from an external probe.
'Bout like tryin to see if the electric wires inside a wall are too
hot, by holding your hand against the sheetrock. Still it's somethin to
go by. Figure your cpu core temp is 10 to 20C hotter than the probe
reports tho.

So we need lm_sensors. It's on your CD's, install
   liblm_sensors1-2.6.4-4mdk
   lm_sensors-2.6.4-4mdk
...or just type 'lm-sensors' into SoftwareManager. We won't fool with
gkrellm just yet.

After the rpms are installed, su to root and run 'sensors-detect'. All
the default answers to the questions it presents should be ok, just
keep hitting <Enter>. When it gets towards the end, it'll output some
lines that you need to edit into the end of /etc/rc.d/rc.local and
/etc/modules.conf. While we're at it, add 'i2c-proc' (w/o the quotes)
to /etc/modules. Gettin back to 'sensors-detect', it probly has one
more question ... install the sensors.conf file? Say Yes.

Then back in ('cd' to) /etc/rc.d/ ... type './rc.local' to restart
rc.local and have the modules take effect. Then as user you should see
temp/voltage/fan outputs when you type 'sensors' in a terminal. Some
have reported a reboot is necessary, but I've never needed to.

We'll concentrate on the cpu temp for now. The cpu temp should stay
under 60C (from a probe), under 55C is better, under extreme load (eg,
a kernel compile, specially 'make modules', running cpuburn, etc.) For
normal operation it should sit in the mid 40's to low 50's, or under.
It's during high temp spikes or sustained load that system freezes or
spontaneous reboots occur. Keep an eye on system voltages too tho, they
should be anywhere from very close to slightly (+10%) over the voltages
spec'd for your motherboard/cpu, and stay very steady.

So for the acid test, cpuburn. It's probly on your CD's, if not get it
here http://users.ev1.net/~redelm/ For your XP 1600+ you want to run
'burnK7'. While doin so, in another terminal check the output of
'sensors' frequently.
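
As a rough sketch of that "check 'sensors' frequently" step, the Python
below pulls temperature readings out of sensors-style text and flags
anything at or over the 60C probe limit mentioned above. The sample text
and its labels (temp1, temp2) are made up for illustration. Real output
varies by sensor chip and lm_sensors version, so the line pattern is an
assumption and may need adjusting.

```python
import re

# Assumed line shape: "label:  +43.0°C  (...)". Voltage and fan lines
# end in V or RPM rather than C, so they do not match.
TEMP_LINE = re.compile(r"^\s*([^:]+):\s*([+-]?\d+(?:\.\d+)?)\s*°?C\b")

# Hypothetical sample, not captured from a real L7VMM board.
SAMPLE = """\
VCore:     +1.74 V  (min =  +1.66 V, max =  +1.84 V)
fan1:      4500 RPM  (min = 3000 RPM, div = 2)
temp1:     +43.0°C  (limit =  +60°C, hysteresis =  +50°C)
temp2:     +61.5°C  (limit =  +60°C, hysteresis =  +50°C)
"""


def parse_temps(text):
    """Return {label: degrees C} for every temperature line found."""
    temps = {}
    for line in text.splitlines():
        match = TEMP_LINE.match(line)
        if match:
            temps[match.group(1).strip()] = float(match.group(2))
    return temps


def over_limit(temps, limit_c=60.0):
    """Return only the readings at or above limit_c (probe temperature)."""
    return {label: t for label, t in temps.items() if t >= limit_c}


if __name__ == "__main__":
    for label, temp in over_limit(parse_temps(SAMPLE)).items():
        print(f"{label} is at {temp}C, time to stop the burn test")
```

To try it on the live machine, hand parse_temps() the captured output of
the 'sensors' command instead of SAMPLE, for example
subprocess.run(["sensors"], capture_output=True, text=True).stdout.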
If the cpu temp climbs to 65C and starts going over, abort burnK7
(Ctrl+C), and figure out what you need to do to improve cooling. 'Cause
that's most likely your crash problem. If you notice the -5 and -12
volt outputs dropping too low (more'n 10%), then the PSU could be the
problem. Voltage drops cause lockups/freezes too. If it all looks OK,
and you can run burnK7 for an hour, your crash problem almost surely
isn't hardware.

Sorry for being so long winded, but I warned you that diagnosing
hardware over the phone was difficult ;)
-- 
    Tom Brinkman                  Corpus Christi, Texas