Re: hardware errors
On 10/06/14 23:24, Ralf Mardorf wrote: On Tue, 2014-06-10 at 23:07 +1200, Richard Hector wrote: On 10/06/14 23:04, Ralf Mardorf wrote: On Tue, 2014-06-10 at 21:24 +1200, Richard Hector wrote: On 09/06/14 11:35, B wrote: On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. Thanks. Have created a memtest86+ CD and will try it tomorrow evening (need a scheduled time to take it down). Interestingly, there are no more errors logged for the last day and a half ... Any guesses as to how long these 3 complete cycles will take? It's a Sun Fire X2100 M2 (dual core opteron 1218, 2600MHz) with 4G of RAM. I haven't run memtest for ages ... IIRC one complete standard test with my dual-core Athlon 2.1 GHz 4 GiB RAM takes more than 1 hour. I guess in 1 day it does around 8 complete tests, perhaps I run it just during the night in half of a day. I might be mistaken, but you should expect that you need to run it for several hours. Thanks. I'm not sure how long we can afford to leave the machine down; hopefully the error will show up promptly. BTW - it will show an error even if ECC corrects it, right? No ECC here. I don't know. I used StartPage and searched for memtest ECC. It seems that memtest isn't good for testing ECC. The current version seems to support very limited hardware, seemingly Intel only. Yep. Halfway through the third pass; no errors yet. I'm not holding my breath. Any ideas on where to read up on those error messages, to figure out what they actually mean? Richard -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: https://lists.debian.org/53981f2d.9070...@walnut.gen.nz
Re: hardware errors
On 11/06/14 10:19, Richard Hector wrote: On 10/06/14 23:24, Ralf Mardorf wrote: On Tue, 2014-06-10 at 23:07 +1200, Richard Hector wrote: On 10/06/14 23:04, Ralf Mardorf wrote: On Tue, 2014-06-10 at 21:24 +1200, Richard Hector wrote: On 09/06/14 11:35, B wrote: On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. Thanks. Have created a memtest86+ CD and will try it tomorrow evening (need a scheduled time to take it down). Interestingly, there are no more errors logged for the last day and a half ... Any guesses as to how long these 3 complete cycles will take? It's a Sun Fire X2100 M2 (dual core opteron 1218, 2600MHz) with 4G of RAM. I haven't run memtest for ages ... IIRC one complete standard test with my dual-core Athlon 2.1 GHz 4 GiB RAM takes more than 1 hour. I guess in 1 day it does around 8 complete tests, perhaps I run it just during the night in half of a day. I might be mistaken, but you should expect that you need to run it for several hours. Thanks. I'm not sure how long we can afford to leave the machine down; hopefully the error will show up promptly. BTW - it will show an error even if ECC corrects it, right? No ECC here. I don't know. I used StartPage and searched for memtest ECC. It seems that memtest isn't good for testing ECC. The current version seems to support very limited hardware, seemingly Intel only. Yep. Halfway through the third pass; no errors yet. I'm not holding my breath. Any ideas on where to read up on those error messages, to figure out what they actually mean? Richard Is it the CPU cache rather than RAM? https://bugzilla.kernel.org/show_bug.cgi?id=43205 https://bbs.archlinux.org/viewtopic.php?id=112113 rob Archive: https://lists.debian.org/53982de7.6050...@rektau.ukfsn.org
Re: hardware errors
On Wed, Jun 11, 2014 at 6:19 PM, Richard Hector rich...@walnut.gen.nz wrote: [...] Yep. Halfway through the third pass; no errors yet. I'm not holding my breath. Any ideas on where to read up on those error messages, to figure out what they actually mean? Richard Don't know about other people, but when the memory subsystem starts giving me grief, I generally vacuum around the motherboard and other internal stuff, re-seat the cable connectors, and pop the memory and I/O boards out, clean the contacts, and re-seat them, too. -- Joel Rees Be careful where you see conspiracy. Look first in your own heart. Archive: https://lists.debian.org/caar43imnq7h94dlwdrzj3kbqiyonsry+zc4hc464rdtacom...@mail.gmail.com
Re: hardware errors
On Wed, 2014-06-11 at 21:01 +0900, Joel Rees wrote: On Wed, Jun 11, 2014 at 6:19 PM, Richard Hector rich...@walnut.gen.nz wrote: [...] Yep. Halfway through the third pass; no errors yet. I'm not holding my breath. Any ideas on where to read up on those error messages, to figure out what they actually mean? Richard Don't know about other people, but when the memory subsystem starts giving me grief, I generally vacuum around the motherboard and other internal stuff, re-seat the cable connectors, and pop the memory and I/O boards out, clean the contacts, and re-seat them, too. I tend to use compressed air instead of a vacuum, but the effect is the same ;). Unmounting and remounting is good advice; cleaning usually isn't needed. Archive: https://lists.debian.org/1402492271.11529.36.camel@archlinux
Debian memtest package faulty? (was ... Re: hardware errors)
On Mon, Jun 09, 2014 at 01:45:53AM +0200, Ralf Mardorf wrote: On Mon, 2014-06-09 at 01:35 +0200, B wrote: On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. I would use the memtest live media instead, so you can be sure that you always get the current version from upstream. On my machine memtest from Debian and Ubuntu fails, while the same versions from the live media don't fail. So you are saying the Debian and Ubuntu versions are buggy? Which live media version works for you? Have you filed a bug? Does the memtest86+ package work from the grub menu, for you? -- If you're not careful, the newspapers will have you hating the people who are being oppressed, and loving the people who are doing the oppressing. --- Malcolm X Archive: https://lists.debian.org/20140610090654.GH3560@tal
Re: hardware errors
On 09/06/14 11:35, B wrote: On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. Thanks. Have created a memtest86+ CD and will try it tomorrow evening (need a scheduled time to take it down). Interestingly, there are no more errors logged for the last day and a half ... Any guesses as to how long these 3 complete cycles will take? It's a Sun Fire X2100 M2 (dual core opteron 1218, 2600MHz) with 4G of RAM. I haven't run memtest for ages ... Richard Archive: https://lists.debian.org/5396cec8.7010...@walnut.gen.nz
Re: Debian memtest package faulty? (was ... Re: hardware errors)
On Tue, 2014-06-10 at 21:06 +1200, Chris Bannister wrote: So you are saying the Debian and Ubuntu versions are buggy? Which live media version works for you? Have you filed a bug? Does the memtest86+ package work from the grub menu, for you? I can't say whether it would work from the GRUB menu now, but it didn't for older Debian and Ubuntu installs. No, I didn't file a bug report, but I did report it to at least one *buntu devel mailing list. Btw., I don't know whether memtest from Arch Linux would work; I simply never installed it on my Debians/*buntus or other installs anymore. There's live media from memtest. I'm too lazy to search for the link. I got false positives. A perfectly working machine got RAM errors right at the start of the test, but when using the memtest from the memtest live media, there were no errors when running the test several times, each day for around a day. This was repeatable. Archive: https://lists.debian.org/1402398009.2813.29.camel@archlinux
Re: hardware errors
On Tue, 2014-06-10 at 21:24 +1200, Richard Hector wrote: On 09/06/14 11:35, B wrote: On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. Thanks. Have created a memtest86+ CD and will try it tomorrow evening (need a scheduled time to take it down). Interestingly, there are no more errors logged for the last day and a half ... Any guesses as to how long these 3 complete cycles will take? It's a Sun Fire X2100 M2 (dual core opteron 1218, 2600MHz) with 4G of RAM. I haven't run memtest for ages ... IIRC one complete standard test with my dual-core Athlon 2.1 GHz 4 GiB RAM takes more than 1 hour. I guess in 1 day it does around 8 complete tests, perhaps I run it just during the night in half of a day. I might be mistaken, but you should expect that you need to run it for several hours. Archive: https://lists.debian.org/1402398283.2813.33.camel@archlinux
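For planning the downtime window, the two figures quoted above (more than 1 hour per pass, and around 8 passes per day) can be turned into a rough range. This is only a back-of-the-envelope sketch: the helper name estimate_hours is made up for illustration, and the numbers come from a different machine (an Athlon with 4 GiB), so the X2100 M2 may well differ.

```python
def estimate_hours(passes, hours_per_pass):
    """Total run time for a number of complete memtest passes."""
    return passes * hours_per_pass


# Lower bound: "more than 1 hour" per complete pass.
low = estimate_hours(3, 1.0)
# Upper bound: "around 8 complete tests" in 1 day, i.e. 24 / 8 hours per pass.
high = estimate_hours(3, 24 / 8)

print(f"3 passes: roughly {low:.0f} to {high:.0f} hours")
```

On those assumptions, three complete cycles need an evening at best and most of a night at worst.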
Re: hardware errors
On 10/06/14 23:04, Ralf Mardorf wrote: On Tue, 2014-06-10 at 21:24 +1200, Richard Hector wrote: On 09/06/14 11:35, B wrote: On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. Thanks. Have created a memtest86+ CD and will try it tomorrow evening (need a scheduled time to take it down). Interestingly, there are no more errors logged for the last day and a half ... Any guesses as to how long these 3 complete cycles will take? It's a Sun Fire X2100 M2 (dual core opteron 1218, 2600MHz) with 4G of RAM. I haven't run memtest for ages ... IIRC one complete standard test with my dual-core Athlon 2.1 GHz 4 GiB RAM takes more than 1 hour. I guess in 1 day it does around 8 complete tests, perhaps I run it just during the night in half of a day. I might be mistaken, but you should expect that you need to run it for several hours. Thanks. I'm not sure how long we can afford to leave the machine down; hopefully the error will show up promptly. BTW - it will show an error even if ECC corrects it, right? Richard Archive: https://lists.debian.org/5396e6eb.4080...@walnut.gen.nz
Re: Debian memtest package faulty? (was ... Re: hardware errors)
On Tue, 2014-06-10 at 13:00 +0200, Ralf Mardorf wrote: On Tue, 2014-06-10 at 21:06 +1200, Chris Bannister wrote: So you are saying the Debian and Ubuntu versions are buggy? Which live media version works for you? Have you filed a bug? Does the memtest86+ package work from the grub menu, for you? I can't say whether it would work from the GRUB menu now, but it didn't for older Debian and Ubuntu installs. No, I didn't file a bug report, but I did report it to at least one *buntu devel mailing list. Btw., I don't know whether memtest from Arch Linux would work; I simply never installed it on my Debians/*buntus or other installs anymore. There's live media from memtest. I'm too lazy to search for the link. I got false positives. A perfectly working machine got RAM errors right at the start of the test, but when using the memtest from the memtest live media, there were no errors when running the test several times, each day for around a day. This was repeatable. Correction: "each day for around a day" should read "each time for around a day". Archive: https://lists.debian.org/1402398650.2813.37.camel@archlinux
Re: hardware errors
On Tue, 2014-06-10 at 23:07 +1200, Richard Hector wrote: On 10/06/14 23:04, Ralf Mardorf wrote: On Tue, 2014-06-10 at 21:24 +1200, Richard Hector wrote: On 09/06/14 11:35, B wrote: On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. Thanks. Have created a memtest86+ CD and will try it tomorrow evening (need a scheduled time to take it down). Interestingly, there are no more errors logged for the last day and a half ... Any guesses as to how long these 3 complete cycles will take? It's a Sun Fire X2100 M2 (dual core opteron 1218, 2600MHz) with 4G of RAM. I haven't run memtest for ages ... IIRC one complete standard test with my dual-core Athlon 2.1 GHz 4 GiB RAM takes more than 1 hour. I guess in 1 day it does around 8 complete tests, perhaps I run it just during the night in half of a day. I might be mistaken, but you should expect that you need to run it for several hours. Thanks. I'm not sure how long we can afford to leave the machine down; hopefully the error will show up promptly. BTW - it will show an error even if ECC corrects it, right? No ECC here. I don't know. I used StartPage and searched for memtest ECC. It seems that memtest isn't good for testing ECC. The current version seems to support very limited hardware, seemingly Intel only. Archive: https://lists.debian.org/1402399473.2813.44.camel@archlinux
hardware errors
Hi all, I'm seeing this kind of thing in kern.log: http://paste.debian.net/104039/ I've never seen these messages before IIRC, so I'm not entirely sure if I'm interpreting them correctly. It looks like some messages are telling me about RAM ECC errors, and others perhaps about cache? On the CPU? Or is it all RAM errors, detected at different places? I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Thanks, Richard Archive: https://lists.debian.org/5394f031.6000...@walnut.gen.nz
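On the "which DIMM(s)?" question, one thing that might help without taking the machine down is the kernel's EDAC error counters in sysfs. The sketch below is not from the thread and rests on assumptions: that an EDAC driver is loaded on this machine, and that it exposes the legacy per-csrow layout (mcN/csrowM/ce_count and ue_count) under /sys/devices/system/edac/mc. The function names are invented for illustration, and how a csrow maps to a physical slot depends on the board, so check the vendor documentation before pulling a module.

```python
from pathlib import Path

# Assumed location of the legacy EDAC csrow counters; not present on
# machines without a loaded EDAC driver.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_int(path):
    """Return the integer stored in a sysfs file, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def collect_counts(root=EDAC_ROOT):
    """List corrected (ce) and uncorrected (ue) error counts per csrow."""
    rows = []
    for csrow in sorted(root.glob("mc*/csrow*")):
        rows.append({
            "controller": csrow.parent.name,
            "csrow": csrow.name,
            "ce": read_int(csrow / "ce_count"),
            "ue": read_int(csrow / "ue_count"),
        })
    return rows


counts = collect_counts()
if not counts:
    print("no EDAC csrow entries found")
for row in counts:
    print(f"{row['controller']}/{row['csrow']}: "
          f"corrected={row['ce']} uncorrected={row['ue']}")
```

A csrow whose corrected count keeps climbing while the others stay at zero would be the first candidate to swap or re-seat.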
Re: hardware errors
On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. -- Pierre : pfff, look at the window, kids spending their life biking outside… tom : they don't have a pc or what ?
Re: hardware errors
On Mon, 2014-06-09 at 01:35 +0200, B wrote: On Mon, 09 Jun 2014 11:22:25 +1200 Richard Hector rich...@walnut.gen.nz wrote: I assume the RAM needs replacing - is it possible to figure out which DIMM(s)? Install memtest86+ and boot on it, then leave at least 3 complete cycles to run. I would use the memtest live media instead, so you can be sure that you always get the current version from upstream. On my machine memtest from Debian and Ubuntu fails, while the same versions from the live media don't fail. Archive: https://lists.debian.org/1402271153.8886.4.camel@archlinux