Re: Debian ADM64 Etch (testing/unstable) system freeze
On Friday 03 February 2006 15:47, Anthony DeRobertis wrote:
> Rami Saarinen wrote:
> > Anyway, I am glad to inform that yes it really was the memory that was
> > causing the trouble. I let the machine run the memtest86+ last night and
> > after 10 hours it had found four memory errors. Apparently I was too
> > hasty at the first time.
>
> Well, now you get the next fun step... verifying that the bad memory
> didn't corrupt your system install, or your data. I think you said you
> have ECC memory, so you're probably safe, but you should run debsums,
> making sure it checks every package installed on your system (you'll
> have to download copies of a bunch of the .deb's that don't include
> md5sum information in them).

Hey, thanks! I almost forgot this. Just can't wait for that fun to begin... :)

Thanks to all for good help!

--
Rami Saarinen

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]
Re: Debian ADM64 Etch (testing/unstable) system freeze
Rami Saarinen wrote:
> Anyway, I am glad to inform that yes it really was the memory that was
> causing the trouble. I let the machine run the memtest86+ last night and
> after 10 hours it had found four memory errors. Apparently I was too
> hasty at the first time.

Well, now you get the next fun step... verifying that the bad memory
didn't corrupt your system install, or your data. I think you said you
have ECC memory, so you're probably safe, but you should run debsums,
making sure it checks every package installed on your system (you'll
have to download copies of a bunch of the .deb's that don't include
md5sum information in them).
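To make the debsums step above a little more concrete, here is a minimal Python sketch of the kind of check it performs: compare each installed file against a recorded MD5 digest. This is not debsums itself, and debsums remains the tool to use. The listing format (hex digest, whitespace, then a path relative to the filesystem root) is my assumption about how the per-package md5sums files look, and the function names `md5_of_file` and `verify_md5sums` are made up for illustration.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5sums(md5sums_text, root="/"):
    """Check files under `root` against an md5sums-style listing.

    Each non-empty line is assumed to be "<hex digest>  <relative path>".
    Returns a list of (relative_path, status) for every file that is
    missing or whose digest differs; an empty list means all files matched.
    """
    problems = []
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue
        expected, relative_path = line.split(None, 1)
        relative_path = relative_path.strip()
        full_path = os.path.join(root, relative_path)
        if not os.path.isfile(full_path):
            problems.append((relative_path, "missing"))
        elif md5_of_file(full_path) != expected.lower():
            problems.append((relative_path, "changed"))
    return problems
```

A file reported as "changed" is only a hint: configuration files are edited on purpose, so a mismatch needs a look before blaming the RAM. Packages that ship no digests cannot be checked this way at all, which is why the .deb files have to be downloaded for them.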
Re: Debian ADM64 Etch (testing/unstable) system freeze
On Tue, Jan 31, 2006 at 12:08:28AM -0800, Corey Hickey wrote:
> Rami Saarinen wrote:
> > Anyway, I am glad to inform that yes it really was the memory that was
> > causing the trouble. I let the machine run the memtest86+ last night and
> > after 10 hours it had found four memory errors. Apparently I was too
> > hasty at the first time.
> >
> > I have one more stupid question: as it may take couple of days for me to
> > get the new memory. Is there any way to block / reserve the faulty
> > memory area so that it would not be available for use?
>
> If memtest86+ is consistently reporting a few addresses, then you can
> use the badram kernel patch:
>
> http://rick.vanrein.org/linux/badram/
>
> I had some very slight stability issues with my machine after I built
> it, and memtest86+ reported one memory failure after I ran it for a
> while. The problem turned out to be that my BIOS was, for some reason,
> setting the memory timing (CAS/RAS/etc. -- I don't remember which) more
> aggressively than the values at which the RAM was specced to operate.
> So, if memtest86+ seems to be reporting random, sporadic failures, you
> might try checking and increasing your memory timings.

You might also want to try reseating the memory once or twice, and
checking the cooling to make sure it isn't a heat problem. If you
haven't already, that is.

Cheers,

a
Re: Debian ADM64 Etch (testing/unstable) system freeze
Rami Saarinen wrote:
> Anyway, I am glad to inform that yes it really was the memory that was
> causing the trouble. I let the machine run the memtest86+ last night and
> after 10 hours it had found four memory errors. Apparently I was too
> hasty at the first time.
>
> I have one more stupid question: as it may take couple of days for me to
> get the new memory. Is there any way to block / reserve the faulty
> memory area so that it would not be available for use?

If memtest86+ is consistently reporting a few addresses, then you can
use the badram kernel patch:

http://rick.vanrein.org/linux/badram/

I had some very slight stability issues with my machine after I built
it, and memtest86+ reported one memory failure after I ran it for a
while. The problem turned out to be that my BIOS was, for some reason,
setting the memory timing (CAS/RAS/etc. -- I don't remember which) more
aggressively than the values at which the RAM was specced to operate.
So, if memtest86+ seems to be reporting random, sporadic failures, you
might try checking and increasing your memory timings.

-Corey
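For the "consistently reporting a few addresses" case, a rough Python sketch of how a handful of faulty addresses can be folded into a single address/mask pair may help. It assumes, and this is my reading rather than anything stated in this thread, that the badram patch takes pairs in which a 1 bit in the mask means "this address bit must match". The helper names `covering_pattern` and `format_pattern` are hypothetical; check the patch's own documentation for the exact syntax before using any output.

```python
def covering_pattern(addresses, width=32):
    """Return (base, mask) such that (a & mask) == base for every address.

    Bits on which all addresses agree stay significant (mask bit 1);
    bits on which they differ become "don't care" (mask bit 0).
    """
    if not addresses:
        raise ValueError("need at least one faulty address")
    first = addresses[0]
    differing = 0
    for address in addresses:
        differing |= address ^ first
    mask = ((1 << width) - 1) & ~differing
    return first & mask, mask


def format_pattern(base, mask):
    """Render the pair as 'address,mask' in hex (assumed pair syntax)."""
    return f"0x{base:08x},0x{mask:08x}"
```

One such pair also matches every other address that shares the agreeing bits, so widely scattered errors would mask out far more memory than is actually bad. In that situation several tighter pairs, or simply waiting for the new RAM, is the better choice.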
Re: Debian ADM64 Etch (testing/unstable) system freeze
Anthony DeRobertis wrote:
> Rami Saarinen wrote:
> > Well, somehow I assumed that if the fault is in memory, it is probably
> > in a fixed location
>
> Depends on the type of memory problem. Memory problems can cover
> everything from "this one certain bit is stuck at 0" (what you're thinking
> of) to "the memory timings/voltage/whatever are off, memory functions as
> a hardware random number generator as a result."

Yes, very true.

> Oh, and memory allocation is not random. The kernel is going to wind up
> in a certain spot every time. So will, e.g., init.

Yes. Somehow I ended up thinking that if the fault is in the memory area
the kernel uses, the faulty behaviour would be more devastating and would
occur more often. After all I have run the system for hours without a
problem.

Anyway, I am glad to inform that yes it really was the memory that was
causing the trouble. I let the machine run the memtest86+ last night and
after 10 hours it had found four memory errors. Apparently I was too
hasty at the first time.

I have one more stupid question: as it may take couple of days for me to
get the new memory. Is there any way to block / reserve the faulty
memory area so that it would not be available for use?

Thanks again for help!

--
Rami Saarinen
Re: Debian ADM64 Etch (testing/unstable) system freeze
Rami Saarinen wrote:
> Well, somehow I assumed that if the fault is in memory, it is probably in a
> fixed location

Depends on the type of memory problem. Memory problems can cover
everything from "this one certain bit is stuck at 0" (what you're thinking
of) to "the memory timings/voltage/whatever are off, memory functions as
a hardware random number generator as a result."

Oh, and memory allocation is not random. The kernel is going to wind up
in a certain spot every time. So will, e.g., init.
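The "one bit stuck at 0" end of that range is easy to picture with a toy model. The Python sketch below simulates a small byte-addressed memory with one stuck bit and runs a simple write-then-read pattern test over it. It only illustrates the idea; it is not how memtest86+ is implemented, and the class and function names are invented for the example.

```python
class StuckBitMemory:
    """Toy memory of byte cells in which one bit of one cell always reads 0."""

    def __init__(self, size, bad_address, bad_bit):
        self.cells = [0] * size
        self.bad_address = bad_address
        self.bad_mask = 1 << bad_bit

    def write(self, address, value):
        self.cells[address] = value & 0xFF

    def read(self, address):
        value = self.cells[address]
        if address == self.bad_address:
            value &= ~self.bad_mask  # the stuck bit never reads back as 1
        return value


def pattern_test(memory, size, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Fill memory with each pattern, read it back, and collect mismatches.

    Returns a list of (address, written, read_back) tuples.
    """
    failures = []
    for pattern in patterns:
        for address in range(size):
            memory.write(address, pattern)
        for address in range(size):
            got = memory.read(address)
            if got != pattern:
                failures.append((address, pattern, got))
    return failures
```

A stuck bit fails on every pass, but only for patterns that try to store a 1 in that position. The timing or voltage kind of fault has no such regularity, which fits the four errors that turned up only after ten hours of testing.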
Re: Debian ADM64 Etch (testing/unstable) system freeze
On Friday 27 January 2006 23:58, Andrew Sharp wrote:
> On Fri, Jan 27, 2006 at 01:57:59AM +0200, Rami Saarinen wrote:
> > On Thursday 26 January 2006 15:21, Andrew Syrewicze wrote:
> > > I wouldn't rule out the possibility of your processor getting too hot.
> > > The Newcastle cores aren't as solid as Venice cores, and I hear they
> > > run a little hotter too. I use a Venice core and I can overclock the
> > > crap out of that thing. (with a huge Thermaltake fan on it of course )
> > > :-P.
> > >
> > > Anyway I would start by checking your CPU temp. I would first check in
> > > BIOS.
> >
> > Froze two times again today. First time I was moving a 2.1 GB file to
> > another location on the disk and the second time I was doing the same as
> > in my previous post. This time I was lucky as there was actually some
> > output.
> >
> > First time froze with: "kernel stack segment [1]"
> > and the second: "general protection fault "
> >
> > After reboot I checked the temperature from BIOS - 32 Celsius, so it is
> > not an overheating issue.
> >
> > I doubt the memory issue also as I'd expect alternating symptoms like
> > programs crashing etc. not just full system freeze every time. (?)
> > Thanks to everyone for the help.
>
> I don't know why you would assume that. Memory problems can cause
> any/all of these symptoms, but don't have to cause any particular one.
> It sure sounds like a hardware/memory problem to me.

Well, somehow I assumed that if the fault is in memory, it is probably in a
fixed location and there could be variance in which program gets the faulty
part. For example I might assume that a typical memory error is that the
value stored in the memory is changed when it is fetched and thus would
cause various symptoms from rampant crashes to system freeze. But then
again I am no memory expert. (Firefox does seem to be unstable at the
moment).

Anyway I am going to run memtest seriously this time and I am also trying
to borrow some other memory to see if the problems persist. Thanks all for
help.

--
Rami Saarinen
Re: Debian ADM64 Etch (testing/unstable) system freeze
On Fri, Jan 27, 2006 at 01:57:59AM +0200, Rami Saarinen wrote:
> On Thursday 26 January 2006 15:21, Andrew Syrewicze wrote:
> > I wouldn't rule out the possibility of your processor getting too hot. The
> > Newcastle cores aren't as solid as Venice cores, and I hear they run a
> > little hotter too. I use a Venice core and I can overclock the crap out of
> > that thing. (with a huge Thermaltake fan on it of course ) :-P.
> >
> > Anyway I would start by checking your CPU temp. I would first check in
> > BIOS.
>
> Froze two times again today. First time I was moving a 2.1 GB file to another
> location on the disk and the second time I was doing the same as in my
> previous post. This time I was lucky as there was actually some output.
>
> First time froze with: "kernel stack segment [1]"
> and the second: "general protection fault "
>
> After reboot I checked the temperature from BIOS - 32 Celsius, so it is not
> an overheating issue.
>
> I doubt the memory issue also as I'd expect alternating symptoms like
> programs crashing etc. not just full system freeze every time. (?)
> Thanks to everyone for the help.

I don't know why you would assume that. Memory problems can cause
any/all of these symptoms, but don't have to cause any particular one.
It sure sounds like a hardware/memory problem to me.

Cheers,

a
Re: Debian ADM64 Etch (testing/unstable) system freeze
On Thursday 26 January 2006 15:21, Andrew Syrewicze wrote:
> I wouldn't rule out the possibility of your processor getting too hot. The
> Newcastle cores aren't as solid as Venice cores, and I hear they run a
> little hotter too. I use a Venice core and I can overclock the crap out of
> that thing. (with a huge Thermaltake fan on it of course ) :-P.
>
> Anyway I would start by checking your CPU temp. I would first check in
> BIOS.

Froze two times again today. First time I was moving a 2.1 GB file to
another location on the disk and the second time I was doing the same as
in my previous post. This time I was lucky as there was actually some
output.

First time froze with: "kernel stack segment [1]"
and the second: "general protection fault "

After reboot I checked the temperature from BIOS - 32 Celsius, so it is not
an overheating issue.

I doubt the memory issue also as I'd expect alternating symptoms like
programs crashing etc. not just full system freeze every time. (?)
Thanks to everyone for the help.

--
Rami Saarinen
Re: Debian ADM64 Etch (testing/unstable) system freeze
I wouldn't rule out the possibility of your processor getting too hot. The
Newcastle cores aren't as solid as Venice cores, and I hear they run a
little hotter too. I use a Venice core and I can overclock the crap out of
that thing. (with a huge Thermaltake fan on it of course ) :-P.

Anyway I would start by checking your CPU temp. I would first check in
BIOS. You might also try installing gkrellm. It's a nice program for system
monitoring. Make sure you have acpi installed as well!!!

You could also try UNDERclocking your processor, and if none of this
works, try putting in another video card. Worst case it's your system
board (which I highly doubt).

Good luck

-Andy