On Tuesday 21 September 2010, Stroller wrote:
> On 21 Sep 2010, at 18:37, Grant wrote:
> >>>> I'm getting a lot of machine check exception errors in dmesg on my
> >>>> hosted server.  Running mcelog I get:
> >>>> ...
> > 
> > They offered to take my machine down and do a memory test which they
> > said would take a number of hours.  Is a memory test likely to help?
> > Did you suggest reseating or replacing RAM modules as opposed to a
> > memory test because it will result in less downtime?
> 
> I suspect that your hosting provider are offering you this memory test
> because they don't want to go swapping out memory modules willy-nilly.
> 
> How do they know that the problem is really memory, and not your operating
> system? If they take all this RAM out and put new RAM in, what do they do
> with the old RAM? They don't know if it's good or bad, so are they
> expected to just slap it in a server belonging to another customer, and
> stitch him up?
> 
> A memory test is likely to identify bad RAM, if it is bad, so you should
> proceed with this. This is likely the best route to solving the problem.
> 

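Are you sure?
This is ECC RAM. Does memtest report ECC-corrected errors? I don't think so,
so a memory test could come back clean even though the hardware keeps
correcting errors.
The MCE errors say:
we detected an error. The error was corrected. Applications will not see the
error. Everything marches on.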

The RAM is borked and must be replaced.
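
If you want a running count of corrected errors without taking the box down, the kernel's EDAC subsystem exposes per-memory-controller counters in sysfs. Below is a minimal sketch, assuming an EDAC driver for the chipset is loaded and that the counters live under /sys/devices/system/edac/mc (the usual location). The function name read_edac_counts is made up for illustration and is not part of mcelog or any other tool.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} read from EDAC sysfs.

    Assumption: an EDAC driver is loaded, so each memory controller shows up
    as a directory mc0, mc1, ... containing ce_count and ue_count files.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        # No EDAC directory: driver not loaded or hardware not supported.
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected count that keeps climbing between runs would back up what the MCE messages already say, and gives the hosting provider something concrete to look at.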
