Re: Random Crashes
Stephen Olander Waters wrote:
Turn off Chipkill in the BIOS unless you know for a fact that your RAM is single rank (x4bit). -s
Hi Stephen, turning off chipkill seems to work for me! Thank you very much!
Bye, Giorgio.
-- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Random Crashes
Hello all,
Ever since I bought my AMD64 system (AMD 64 3000+, Asus K8U-X motherboard) I've been experiencing random freezes, in which the system completely stops responding, or sometimes automatic reboots. Sometimes the system halts during boot with a message such as
HARDWARE ERROR
CPU 0: Machine Check Exception:4 Bank 4: b2070f0f
TSC a38a02f0b
This is not a software problem!
The crashes do not necessarily happen when the system is doing something RAM or processor intensive. I can do heavy tasks such as video encoding with no problems, but sometimes the system crashes when it's idle, with only background tasks running. Also, the crashes are not so frequent.
When I bought the system, it had one stick with 512MB of RAM. Crashes already happened then. Later I added another stick with 1GB of RAM. I suspected the memory and ran memtest, but only for a short time, so I cannot conclude anything from the lack of errors. So I took out the old 512MB RAM module, because it should be the one with problems, since the crashes already happened when I had only that one. The system still crashed. Just to be sure, I put it back in, and only this one, and the system also crashes.
The motherboard has two slots for RAM. I tried both modules in both slots, and I did notice that when a module (either one) is in one of the slots, the system crashes just after boot --- at most I can type the password and let KDE start, but it crashes before KDE is fully loaded. With a module in the other slot, the system is usable most of the time.
So, am I really unlucky to have two memory modules with problems, or what else should I suspect? Motherboard? Processor? What would be the possible ways to diagnose the problem?
-- They told me I was gullible ... and I believed them! Eduardo M KALINOWSKI [EMAIL PROTECTED] http://move.to/hpkb
Re: Random Crashes
On Mon, May 07, 2007 at 11:42:08AM -0300, Eduardo M KALINOWSKI wrote:
Hello all,
Ever since I bought my AMD64 system (AMD 64 3000+, Asus K8U-X motherboard) I've been experiencing random freezes, in which the system completely stops responding, or sometimes automatic reboots. Sometimes the system halts during boot with a message such as
HARDWARE ERROR
CPU 0: Machine Check Exception:4 Bank 4: b2070f0f
TSC a38a02f0b
This is not a software problem!
The crashes do not necessarily happen when the system is doing something RAM or processor intensive. I can do heavy tasks such as video encoding with no problems, but sometimes the system crashes when it's idle, with only background tasks running. Also, the crashes are not so frequent.
When I bought the system, it had one stick with 512MB of RAM. Crashes already happened then. Later I added another stick with 1GB of RAM. I suspected the memory and ran memtest, but only for a short time, so I cannot conclude anything from the lack of errors. So I took out the old 512MB RAM module, because it should be the one with problems, since the crashes already happened when I had only that one. The system still crashed. Just to be sure, I put it back in, and only this one, and the system also crashes.
The motherboard has two slots for RAM. I tried both modules in both slots, and I did notice that when a module (either one) is in one of the slots, the system crashes just after boot --- at most I can type the password and let KDE start, but it crashes before KDE is fully loaded. With a module in the other slot, the system is usable most of the time.
So, am I really unlucky to have two memory modules with problems, or what else should I suspect? Motherboard? Processor? What would be the possible ways to diagnose the problem?
A Google search on that error message seems to indicate that it has been seen on some systems where the power supply wasn't sufficient to provide stable power to the system. Another possibility is that the RAM simply isn't stable (although bad power can of course make RAM unstable).
What size power supply, what brand/model, and how much hardware is in that system?
-- Len Sorensen
Re: Random Crashes
On Mon, May 07, 2007 at 11:42:08AM -0300 Eduardo M KALINOWSKI said:
So, am I really unlucky to have two memory modules with problems, or what else should I suspect? Motherboard? Processor? What would be the possible ways to diagnose the problem?
Have you tried running the box with two similar memory modules?
Sam
-- (Sam Varghese) http://www.gnubies.com Program testing can best show the presence of errors but never their absence. - Edsger Wybe Dijkstra My PGP key: http://www.gnubies.com/encryption/sign.txt
Re: Random Crashes
Eduardo M KALINOWSKI wrote:
Hello all, Ever since I bought my AMD64 system (AMD 64 3000+, Asus K8U-X motherboard) I've been experiencing random freezes..
Hi Eduardo, I get the same problem with my Asus K8V Deluxe - AMD Athlon 64 3200+, but only with Debian (etch) AMD64. With other kinds of arch (e.g. i486, i686, k7) no problems at all. I guess the problem is with the AMD64 flavour. Just my 2 cents.
Bye, Giorgio.
Re: Random Crashes
Eduardo M KALINOWSKI wrote:
Hello all, Ever since I bought my AMD64 system (AMD 64 3000+, Asus K8U-X motherboard) I've been experiencing random freezes..
Hi Eduardo, I get the same problem with my Asus K8V Deluxe - AMD Athlon 64 3200+, but only with Debian (etch) AMD64. With other kinds of arch (e.g. i486, i686, k7) no problems at all. I guess the problem is with the AMD64 flavour.
I don't think it's a software problem... I forgot to mention it in the original mail, but there is nothing in the system logs. It just freezes or reboots.
-- Eduardo M Kalinowski [EMAIL PROTECTED] http://move.to/hpkb
Re: Random Crashes
On Mon, 2007-05-07 at 17:35 +, Jack Malmostoso wrote:
On Mon, 07 May 2007 16:50:13 +0200, Eduardo M KALINOWSKI wrote:
HARDWARE ERROR
CPU 0: Machine Check Exception:4 Bank 4: b2070f0f
TSC a38a02f0b
This is not a software problem!
Turn off Chipkill in the BIOS unless you know for a fact that your RAM is single rank (x4bit).
-s
Re: Random Crashes
On Mon, May 07, 2007 at 03:10:34PM -0300, Eduardo M Kalinowski wrote:
The power supply could indeed be the problem. It's no great power supply, and I have two (PATA) HDs. No fancy graphics card, though --- only a SiS 315. I also have a DVD burner. However, I've been able to successfully burn DVDs (when more power would be needed, I guess). Still, changing the power supply would be the easiest thing for me to do. At the worst, I'll have a good one to use when I decide to build a new system. (Unlike the DDR1 modules that the motherboard uses.)
A friend of mine had a system some years ago that crashed very often and corrupted disk contents. Replacing the power supply with a nice high quality name brand one eliminated all the crashes and disk corruption. He noticed the voltage monitoring in the BIOS one day and found that some of the rails were bouncing up and down a lot and were not really that close to what they should be. With a new power supply the voltages became completely steady, as did the system.
Remember that a quality 300W unit will easily handle more load than a generic who-knows-what 500W one. And "the more they weigh, the better they are" is fairly accurate for power supplies, although I am sure there are exceptions to that.
-- Len Sorensen
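To make "not really that close to what they should be" a bit more concrete, here is a minimal Python sketch that summarises a series of rail readings copied by hand from a BIOS hardware monitor. The +/-5% band is an assumption (a commonly quoted tolerance for the positive ATX rails), and the function name and the sample readings are hypothetical, not taken from anyone's system in this thread.

```python
# Nominal voltages of the main positive rails.
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}

# Assumption: readings within 5% of nominal are treated as acceptable.
TOLERANCE = 0.05


def rail_report(samples):
    """Summarise readings per rail.

    samples maps a rail name (a key of NOMINAL_RAILS) to a list of
    voltage readings noted down over time.
    """
    report = {}
    for rail, readings in samples.items():
        nominal = NOMINAL_RAILS[rail]
        low = nominal * (1 - TOLERANCE)
        high = nominal * (1 + TOLERANCE)
        outside = [v for v in readings if not (low <= v <= high)]
        report[rail] = {
            "min": min(readings),
            "max": max(readings),
            # A large swing means the rail is bouncing, even if in range.
            "swing": max(readings) - min(readings),
            "out_of_range": len(outside),
            "ok": not outside,
        }
    return report


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    example = {"+12V": [12.1, 11.2, 12.4], "+5V": [5.02, 4.98, 5.05]}
    for rail, info in rail_report(example).items():
        print(rail, info)
```

A rail that stays in range but shows a large swing is still worth noting, since the anecdote above describes bouncing as well as offset.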
Re: Still having random crashes after installing X with nvidia
On Mon, May 22, 2006 at 08:48:44AM -0400, [EMAIL PROTECTED] wrote:
Unless I get some help from this list, the next thing I'll be trying is going through the list of optional addons to X that dpkg-reconfigure xorg-server provides and taking them out. Things like direct rendering and GL support and such. Is this likely to keep nvidia's OpenGL support from working?
Well! Turning all that stuff off stops the crashes. Would it be useful to try to isolate the problem to one of those options, maybe by a kind of binary search? I.e. are there people interested in getting a report and trying to fix the problem, or are the comm channels to nvidia (I presume that's where the reports have to go) hopelessly clogged?
Yes, please post your report for the archives. The Nvidia docs say that the dri and GLCore modules should not be used in combination with their glx module. So be sure to comment those out; they may be the source of your trouble here.
Good luck, Andrew.
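As a rough illustration of that advice, the following Python sketch scans the text of an xorg.conf for a Module section that loads glx together with dri or GLCore. It is only a sketch under stated assumptions: the parsing is deliberately simple, the function names are made up for this example, and the list of conflicting modules is taken from the reply above rather than checked against any particular driver release.

```python
import re

# Modules the reply above says should not be combined with Nvidia's glx.
# Compared in lower case, so "GLCore" is stored as "glcore".
CONFLICTING = {"dri", "glcore"}

SECTION_RE = re.compile(r'^Section\s+"Module"', re.IGNORECASE)
END_RE = re.compile(r'^EndSection', re.IGNORECASE)
LOAD_RE = re.compile(r'^Load\s+"([^"]+)"', re.IGNORECASE)


def loaded_modules(conf_text):
    """Return module names loaded in Module sections, skipping comments."""
    modules = []
    in_module = False
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if SECTION_RE.match(line):
            in_module = True
        elif END_RE.match(line):
            in_module = False
        elif in_module:
            found = LOAD_RE.match(line)
            if found:
                modules.append(found.group(1))
    return modules


def conflicts_with_nvidia_glx(conf_text):
    """Return sorted lower-case names of conflicting modules, if glx is loaded."""
    names = {m.lower() for m in loaded_modules(conf_text)}
    if "glx" not in names:
        return []
    return sorted(names & CONFLICTING)


if __name__ == "__main__":
    with open("/etc/X11/xorg.conf") as handle:
        print(conflicts_with_nvidia_glx(handle.read()))
```

An empty result only means none of the listed modules is loaded next to glx; it says nothing about other possible causes of the crashes.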
Still having random crashes after installing X with nvidia
[EMAIL PROTECTED] wrote:
I finally got X up with the nvidia drivers! Lovely screen. I had no idea my monitor was capable of such high resolution. It's now my favorite machine to work on. Despite the crashes.
Well, the crashes *do* get wearying -- it tends to crash after somewhere between one and thirty minutes of X use. I've done a bit of testing using XDMCP. The AMD64 machine seems to crash when it is the X server. When it is just the X client (I log into it from elsewhere), it seems to work fine.
Unless I get some help from this list, the next thing I'll be trying is going through the list of optional addons to X that dpkg-reconfigure xorg-server provides and taking them out. Things like direct rendering and GL support and such. Is this likely to keep nvidia's OpenGL support from working?
-- hendrik
Re: Still having random crashes after installing X with nvidia
On Mon, May 22, 2006 at 08:48:44AM -0400, [EMAIL PROTECTED] wrote:
Unless I get some help from this list, the next thing I'll be trying is going through the list of optional addons to X that dpkg-reconfigure xorg-server provides and taking them out. Things like direct rendering and GL support and such. Is this likely to keep nvidia's OpenGL support from working?
Well! Turning all that stuff off stops the crashes. Would it be useful to try to isolate the problem to one of those options, maybe by a kind of binary search? I.e. are there people interested in getting a report and trying to fix the problem, or are the comm channels to nvidia (I presume that's where the reports have to go) hopelessly clogged?
-- hendrik
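The binary search idea can be sketched in a few lines of Python. This is an illustration only, and it rests on assumptions that may not hold on a real machine: exactly one option is responsible, it triggers the crash regardless of which other options are enabled, and the crash reproduces reliably within the test period. The `crashes` callback is hypothetical; in practice it stands for a person enabling that subset, running X for a while, and reporting what happened.

```python
def find_culprit(options, crashes):
    """Narrow a list of options down to the one that triggers a crash.

    options: list of option names to test.
    crashes: callable taking a list of enabled options and returning
             True if the system crashed with only those enabled.
    Assumes exactly one culprit that crashes on its own.
    """
    candidates = list(options)
    if not candidates:
        raise ValueError("no options to test")
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if crashes(first_half):
            candidates = first_half          # culprit is in the first half
        else:
            candidates = candidates[middle:]  # so it must be in the rest
    return candidates[0]


if __name__ == "__main__":
    # Placeholder option names and a simulated test, for illustration.
    names = ["bitmap", "dbe", "ddc", "dri", "evdev", "extmod"]
    print(find_culprit(names, lambda enabled: "dri" in enabled))
```

With twelve options this needs about four test runs instead of twelve, which matters when each run means waiting up to half an hour for a crash. If two options only misbehave in combination, this simple halving will give a misleading answer.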
Re: random crashes after installing X with nvidia
It was a long time ago that I started this thread. Sorry for the delay, but I had to deal with a few domestic emergencies. The problem still needs fixing.
I originally wrote:
I finally got X up with the nvidia drivers! Lovely screen. I had no idea my monitor was capable of such high resolution. It's now my favorite machine to work on. Despite the crashes. Every now and then it just stops, and becomes completely unresponsive to mouse and keyboard input, and ignores attempts to ssh in from another box. Hardware-level reset is the only way to recover. Not too painful, because of the reiser file system, but definitely disturbing. Before X, it just never crashed, and has been running as an NFS file server to the other machines on the LAN. The nvidia kernel I use is nvidia-kernel-2.6.12-1-amd64-generic_1.0.8756-1_amd64.deb
I installed a new kernel, am now running vmlinuz-2.6.15-1-amd64-generic with nvidia-kernel-2.6.15-1-amd64-generic_1.0.8756-4_amd64.deb but it hasn't helped. I'm back to 2.6.12, because 2.6.15 occasionally causes it to fail to have any network connectivity. (It doesn't recognise the onboard ethernet, so I'm using a PCI ethernet card instead. Could 2.6.15 be recognising both and having trouble? That's an investigation for another day, I suspect. Unless the crashes are really some sort of networking problem, of course.)
with nvidia-glx_1.0.8756-1_amd64.deb The kernel is vmlinuz-2.6.12-1-amd64-generic obtained from Len Sorensen's site (did I remember the spelling of his name correctly?)
Once, only once, when it crashed I still had a functioning keyboard -- enough to ctrl-alt-F1 to a root console, where I saw a flurry of messages about eth0 -- complaining about not having access to it and wondering whether another device might be competing for interrupts.
I'd like to get this fixed. Is this a known problem and I should upgrade kernel and driver? Is it likely to be hardware (which would be awkward because the warranty on the assembled board is from a company that, although it supported Linux, is now defunct)? What diagnostic information should I be collecting? Are there any other very maintenance actions (in the words of Dave Barry :))?
-- hendrik
And I added:
Should have mentioned: I'm running etch, installed from Len's netinstall CD. I get updates from csail at MIT.
And now from the regular Debian mirrors.
And received an offlist reply:
On Apr 23, 2006 at 04:38:33PM -0400, Cakey (jon) wrote:
Would you mind posting your /etc/X11/xorg.conf as well as your /var/log/Xorg.0.log ?
Sorry for the delay, but here they are:
/etc/X11/xorg.conf:
---
# xorg.conf (Xorg X Window System server configuration file)
#
# This file was generated by dexconf, the Debian X Configuration tool, using
# values from the debconf database.
#
# Edit this file with caution, and see the xorg.conf manual page.
# (Type "man xorg.conf" at the shell prompt.)
#
# This file is automatically updated on xserver-xorg package upgrades *only*
# if it has not been modified since the last upgrade of the xserver-xorg
# package.
#
# If you have edited this file but would like it to be automatically updated
# again, run the following commands as root:
#
#   cp /etc/X11/xorg.conf /etc/X11/xorg.conf.custom
#   md5sum /etc/X11/xorg.conf >/var/lib/xfree86/xorg.conf.md5sum
#   dpkg-reconfigure xserver-xorg
Section "Files"
    FontPath "unix/:7100" # local font server
    # if the local font server has problems, we can fall back on these
    FontPath "/usr/lib/X11/fonts/misc"
    FontPath "/usr/lib/X11/fonts/cyrillic"
    FontPath "/usr/lib/X11/fonts/100dpi/:unscaled"
    FontPath "/usr/lib/X11/fonts/75dpi/:unscaled"
    FontPath "/usr/lib/X11/fonts/Type1"
    FontPath "/usr/lib/X11/fonts/CID"
    FontPath "/usr/lib/X11/fonts/100dpi"
    FontPath "/usr/lib/X11/fonts/75dpi"
EndSection
Section "Module"
    Load "bitmap"
    Load "dbe"
    Load "ddc"
    Load "dri"
    Load "evdev"
    Load "extmod"
    Load "freetype"
    Load "glx"
    Load "int10"
    Load "record"
    Load "type1"
    Load "vbe"
EndSection
Section "InputDevice"
    Identifier "Generic Keyboard"
    Driver "keyboard"
    Option "CoreKeyboard"
    Option "XkbRules" "xorg"
    Option "XkbModel" "pc104"
    Option "XkbLayout" "us"
EndSection
Section "InputDevice"
    Identifier "Configured Mouse"
    Driver "mouse"
    Option "CorePointer"
    Option "Device" "/dev/input/mice"
    Option "Protocol" "ImPS/2"
    Option "Emulate3Buttons" "true"
    Option "ZAxisMapping" "4 5"
EndSection
Section "Device"
    Identifier "Generic Video Card"
    Driver
random crashes after installing X with nvidia
I finally got X up with the nvidia drivers! Lovely screen. I had no idea my monitor was capable of such high resolution. It's now my favorite machine to work on. Despite the crashes.
Every now and then it just stops, and becomes completely unresponsive to mouse and keyboard input, and ignores attempts to ssh in from another box. Hardware-level reset is the only way to recover. Not too painful, because of the reiser file system, but definitely disturbing. Before X, it just never crashed, and has been running as an NFS file server to the other machines on the LAN.
The nvidia kernel I use is nvidia-kernel-2.6.12-1-amd64-generic_1.0.8756-1_amd64.deb with nvidia-glx_1.0.8756-1_amd64.deb The kernel is vmlinuz-2.6.12-1-amd64-generic obtained from Len Sorensen's site (did I remember the spelling of his name correctly?)
Once, only once, when it crashed I still had a functioning keyboard -- enough to ctrl-alt-F1 to a root console, where I saw a flurry of messages about eth0 -- complaining about not having access to it and wondering whether another device might be competing for interrupts.
I'd like to get this fixed. Is this a known problem and I should upgrade kernel and driver? Is it likely to be hardware (which would be awkward because the warranty on the assembled board is from a company that, although it supported Linux, is now defunct)? What diagnostic information should I be collecting? Are there any other very maintenance actions (in the words of Dave Barry :))?
-- hendrik
Re: random crashes after installing X with nvidia
On Sun, Apr 23, 2006 at 09:09:35AM -0400, [EMAIL PROTECTED] wrote:
I finally got X up with the nvidia drivers! Lovely screen. I had no idea my monitor was capable of such high resolution. It's now my favorite machine to work on. Despite the crashes.
Every now and then it just stops, and becomes completely unresponsive to mouse and keyboard input, and ignores attempts to ssh in from another box. Hardware-level reset is the only way to recover. Not too painful, because of the reiser file system, but definitely disturbing. Before X, it just never crashed, and has been running as an NFS file server to the other machines on the LAN.
The nvidia kernel I use is nvidia-kernel-2.6.12-1-amd64-generic_1.0.8756-1_amd64.deb with nvidia-glx_1.0.8756-1_amd64.deb The kernel is vmlinuz-2.6.12-1-amd64-generic obtained from Len Sorensen's site (did I remember the spelling of his name correctly?)
Should have mentioned: I'm running etch, installed from Len's netinstall CD. I get updates from csail at MIT.
Once, only once, when it crashed I still had a functioning keyboard -- enough to ctrl-alt-F1 to a root console, where I saw a flurry of messages about eth0 -- complaining about not having access to it and wondering whether another device might be competing for interrupts.
I'd like to get this fixed. Is this a known problem and I should upgrade kernel and driver? Is it likely to be hardware (which would be awkward because the warranty on the assembled board is from a company that, although it supported Linux, is now defunct)? What diagnostic information should I be collecting? Are there any other very maintenance actions (in the words of Dave Barry :))?
-- hendrik