On Wednesday 22 September 2010 02:24:39 Grant wrote:
> >> >>>> I'm getting a lot of machine check exception errors in dmesg on my
> >> >>>> hosted server.  Running mcelog I get:
> >> >>>> ...
> >> > 
> >> > They offered to take my machine down and do a memory test which they
> >> > said would take a number of hours.  Is a memory test likely to help?
> >> > Did you suggest reseating or replacing RAM modules as opposed to a
> >> > memory test because it will result in less downtime?
> >> 
> >> I suspect that your hosting provider are offering you this memory test
> >> because they don't want to go swapping out memory modules willy-nilly.
> >> 
> >> How do they know that the problem is really memory, and not your
> >> operating system? If they take all this RAM out and put new RAM in,
> >> what do they do with the old RAM? They don't know if it's good or bad,
> >> so are they expected to just slap it in a server belonging to another
> >> customer, and stitch him up?
> >> 
> >> A memory test is likely to identify bad RAM, if it is bad, so you should
> >> proceed with this. This is likely the best route to solving the problem.
> >> 
> >> I think that ideally, for you, they would move the system image onto a
> >> different known-good server with the same configuration. Then you cannot
> >> complain if the same problems start occurring again. If the problem is
> >> genuinely hardware then they won't. And the hosting provider is free to
> >> run diagnostics on your old machine.
> >> 
> >> But realistically, the memory test is likely to show up a bad RAM
> >> module, you'll get it replaced and be up and running within a few
> >> hours. Why would you refuse? If your system needed a guaranteed uptime
> >> you'd perhaps have to pay for a higher level of service than the fees
> >> you're paying at present.
> > 
> > I run memory tests overnight.  If a module is seriously borked then it
> > will fail earlier.  Reseating/replacing takes a few minutes, instead of
> > hours.
> > 
> > If they have spare machines (for dev't or testing) they can fit the
> > memory module(s) there and test them exhaustively, before they put the
> > good ones back into a customer's machine.
> 
> Thanks Mick and Stroller.  I'll see if they'll go for this.

You're welcome.  Bear in mind though that a lot of hosters are just glorified 
resellers with an account in a bigger data centre.  In many cases they do not 
even have physical access to the machines.  Only the data centre techies do, 
and they may be less willing to oblige and break procedure or routine just 
because one end user out of hundreds/thousands complained about some memory 
errors.

YMMV
-- 
Regards,
Mick

