On Wednesday 25 September 2002 07:25 pm, you wrote:
> On Wednesday September 25 2002 08:25 am, Marcia wrote:
> > Dear Tom
> >
> > Tom Brinkman wrote:
> > >>Since the 'nopentium' bandaid didn't fix it, let's start again
> > >>Marcia. List the hardware involved, particularly mobo, psu, video,
> > >> and what Mandrake version, which video drivers are used. Ram
> > >> vendor, if you know? IIRC, it's Mdk 8.2, with an ECS mobo. Got
> > >> the model/ revision/bios vendor and numbers?
> >
> > The link for my board is http://www.ecsusa.com/ and my motherboard is
> > the L7VMM.
>
> AMD apprv'd for your 1600+, unfortunately, I have no experience with
> these new micro-boards (an' I'm not an ECS fan). The latest bios is 1.0a
> http://www.ecsusa.com/ecsusa/www.ecs.com.tw/download/l7vmm.htm
> "1. Remove "CPU warning temp item" in BIOS setup
> The ITE8705 chipset use the same high and low limit for "CPU warning
> temp & CPU shutdown temp"
> 2. To fix Hynix 128M X 2 or Samsung 128M X 2 system will auto-restart
> when running"
> .... either fix could be pertinent to your crash problem, so update if
> you don't already have 1.0a. Both are worrisome in that they deal with
> auto shutdowns (crashes), one for temp, the other for ram.

I will update the bios then for starters. I have never done this, so what
is the procedure for doing this?

> > I disabled the onboard lan because even though it worked
> > it was grabbing the same irq as sound. The company sent me a new lan
> > card which helped that it seems. This is an Athlon 1600+ XP with
> > 512MB PC2100 DDR, 266 MHZ SDRAM,
>
> Yes, but who makes the ram? Two important points, the actual ram
> chips and the pcb (board) implementation of the chips. IOW, Micron
> chips (good) on a generic pcb (bad) ... well two wrongs don't make a
> right ;> Look in bios and see what the ram timings are. The most
> lenient are CAS 3-3-3, and if there's a setting for 'bank
> interleaving', disable it. At least till we get your crash
> problem solved, go for lenient. 2-2-2 and 4-bank are the optimum, but
> only good ram on a good mobo with a good PSU can do it.
> Also it's 133 Mhz x2 ram. (the x2, and DDR are mostly marketing talk)
>
> Probly now's a good time to run the machine overnite booting to
> memtest86. Look on your CD's, or use SoftwareManager, you should find
> somethin like memtest86-3.0-2mdk . Install that rpm, it'll add a
> memtest86 boot option to lilo (or grub). When you re-boot, choose this
> option and let the tests run overnite.
>
> Plan B, if your machine doesn't like booting this option, then look
> in /boot. After installin the memtest rpm you'll see a file like
> memtest-3.0.bin. So put in a good floppy and type
> 'dd if=/boot/memtest-3.0.bin of=/dev/fd0' (caution your memtest version
> is probly different than mine). That'll make a memtest86 floppy you
> can boot from. Just choose 'floppy' from lilo. If you can't run
> memtest86 overnite with -0- errors, then we probly have found the
> problem ... the ram, or how well your motherboard gets along with it,
> or both. Could still be PSU tho.
>
> > I had the cooler master added plus
> > an extra case fan. This is a brand new machine.
> > I have Win95 as a
> > dual boot and Win does not have the problems that my Linux side has.
>
> Win9.x --> WinXP tolerates sloppy (win)hardware, actually encourages
> it IMO. Most all CoolerMaster hs/fans are AMD appr'vd, so we probly
> don't need to look there. I'd advise you tho, that it's probly usin a
> thermal pad to contact the cpu's die, and this will deteriorate over
> time, might even fail. Thermal grease is much better, now and later.
>
> > cat /proc/interrupts
> >
> > 11: 154 XT-PIC usb-uhci, usb-uhci
>
> What USB devices do you have? Appears two are sharing IRQ11 or it's
> possibly a double entry. Everything else looked good.

I have a usb HP 4300 scanjet scanner and a HP 940c usb printer.

> > There is a temperature and performance utility in the bios. What are
> > lm_sensors/gkrellm? I would gladly install this if needed.
>
> Most common causes of random, occasional lockups and reboots are
> faulty ram, or overheating. Even a lot of Windoze problems get blamed
> on M$, when these two culprits are really at fault (specially Winsux
> Registry errors).
>
> The temp you see in bios is really only good for verifying that you
> have hardware support for temp, voltage, fan monitoring. When you see
> this temp the system is not under load, and usually is comin from a
> cool state. Specially if it's been off for more'n just a few seconds.
> Processor core temp is _very_ dynamic. Also there's only a very few
> current mobo's that can really access AthlonXP internal diode core
> temps (Asus, Gigabyte). All other boards, including yours an' mine,
> measure the temp from an external probe. 'Bout like tryin to see if the
> electric wires inside a wall are too hot, by holding your hand against
> the sheetrock. Still it's somethin to go by. Figure your cpu core temp
> is 10 to 20C hotter than the probe reports tho.
>
> So we need lm_sensors.
> It's on your CD's, install
> liblm_sensors1-2.6.4-4mdk
> lm_sensors-2.6.4-4mdk ...or just type 'lm-sensors' into
> SoftwareManager. We won't fool with gkrellm just yet. After the rpms
> are installed, su to root and run 'sensors-detect'. All the default
> answers to the questions it presents should be ok, just keep hitting
> <Enter>. When it gets towards the end, it'll output some lines that
> you need to edit into the end of /etc/rc.d/rc.local and
> /etc/modules.conf. While we're at it, add 'i2c-proc' (w/o the quotes)
> to /etc/modules. Gettin back to 'sensors-detect', it probly has one
> more question ... install the sensors.conf file?, say Yes. Then back
> in ('cd' to) /etc/rc.d/ ... type './rc.local' to restart rc.local
> and have the modules take effect. Then as user you should see
> temp/voltage/fan outputs when you type 'sensors' in a terminal. Some
> have reported a reboot is necessary, but I've never needed to.
>
> We'll concentrate on the cpu temp for now. The cpu temp should stay
> under 60C (from a probe), under 55C is better under extreme load (eg, a
> kernel compile, specially 'make modules', running cpuburn, etc.) For
> normal operation it should be under the low 50's to mid 40's. It's
> during high temp spikes or sustained load that system freezes or
> spontaneous reboots occur. Keep an eye on system voltages too tho,
> they should be very close to slightly (+10%) over the voltages spec'd
> for your motherboard/cpu, and stay very steady.
>
> So for the acid test, cpuburn. It's probly on your CD's, if not get
> it here http://users.ev1.net/~redelm/ For your XP 1600+ you want
> to run 'burnK7'. While doin so, in another terminal check the output of
> 'sensors' frequently. If the cpu temp climbs to 65C and starts going
> over, abort burnK7 (Ctrl+C), and figure out what you need to do to
> improve cooling. 'Cause that's most likely your crash problem.
> If you
> notice the -5 and -12 volt outputs dropping too low (more'n 10%), then
> the PSU could be the problem. Voltage drops cause lockups/freezes too.
> If it all looks OK, and you can run burnK7 for an hour, your crash
> problem almost surely isn't hardware.
>
> Sorry for being so long winded, but I warned you that diagnosing
> hardware over the phone was difficult ;)

Thank you very much for your detailed information here. I really
appreciate your time on this. I just got this tonight, so I will study and
try these things over the next few days. I will let you know my results. I
am sure this will be resolved eventually. Thanks again.

Sincerely,
Marcia
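
For the burnK7 test described above, the checks on the 'sensors' output could be scripted roughly as follows. This is only a sketch: the 65C abort point, the 60C load limit and the 10% voltage margin are the figures from Tom's message, but the function names are made up for illustration, and 'temp2' is an assumed label, since which line of 'sensors' carries the cpu probe reading depends on the board's sensor chip and sensors.conf.

```shell
#!/bin/sh
# Sketch only: thresholds are the ones quoted in the message above
# (probe readings in degrees C); all function names are hypothetical.
ABORT_TEMP=65   # "climbs to 65C and starts going over" -> stop burnK7
LOAD_LIMIT=60   # cpu temp "should stay under 60C (from a probe)"

# extract_temp LABEL
# Reads sensors-style text on stdin and prints the first number found on
# the line that begins with "LABEL:". The label is an assumption.
extract_temp() {
    awk -v label="$1" 'index($0, label ":") == 1 {
        rest = substr($0, length(label) + 2)
        if (match(rest, /[0-9]+(\.[0-9]+)?/)) {
            print substr(rest, RSTART, RLENGTH)
            exit
        }
    }'
}

# classify_temp TEMP_C -> prints ok, warm, abort, or unknown
classify_temp() {
    t=${1%.*}                      # drop decimals so integer tests work
    if [ -z "$t" ]; then
        echo unknown
    elif [ "$t" -ge "$ABORT_TEMP" ]; then
        echo abort
    elif [ "$t" -ge "$LOAD_LIMIT" ]; then
        echo warm
    else
        echo ok
    fi
}

# rail_status NOMINAL MEASURED -> prints low, high, or ok
# Flags a rail whose magnitude is more than 10% off its nominal value.
rail_status() {
    awk -v n="$1" -v m="$2" 'BEGIN {
        if (n < 0) { n = -n; m = -m }  # compare magnitudes on -5/-12 rails
        if (m < 0.9 * n)      print "low"
        else if (m > 1.1 * n) print "high"
        else                  print "ok"
    }'
}

# Example loop for a second terminal while burnK7 runs (label assumed):
# while sleep 5; do
#     t=$(sensors | extract_temp temp2)
#     echo "cpu probe: ${t}C -> $(classify_temp "$t")"
# done
```

To try it, start burnK7 in one terminal and run something like the commented loop in another, swapping 'temp2' for whatever label your own 'sensors' output shows for the cpu probe. If it prints 'abort', stop burnK7 with Ctrl+C.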