On Tue, Feb 12, 2019 at 12:11:11AM +0100, Hans van Kranenburg wrote:
> This means you will have to do things like hop on the upstream
> development mailing list, build a reproducible failure case, search for
> a developer that has similar hardware and wants to spend time on it,
> donate hardware to someone to reproduce the error scenarios or learn how
> to do it yourself, or whatever it takes. :)

I had hoped to avoid that.  The problem is that I have to use so many
pieces of software that jumping on the mailing list of each of them would
be akin to trying to read all of Usenet.  I may not be able to avoid that
here, but...

Looks like Xen's MCE support is in near-useless shape.  The code in the
git repository mentions documentation for family 10h; the problem is that
family is almost entirely decade-old processors.  The last apparently
significant change was in 2014.  The copyright notice names AMD, so I
guess that means they need more funding.

Looks like Intel has been offering more support to Xen.  :-(

I'm surprised at Xen's handling of MCE.  Given Xen's approach to things I
would expect MCE handling to be done more by Domain 0.  Let Domain 0
handle talking to the memory controller and merely have Xen map the
physical address to a domain and domain address.  Domain 0 can log all
correctable memory errors to a single location, and in case of an
uncorrectable error it can panic the machine.  (plus Linux's MCE support
is in better shape)
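
To make the split above concrete, here is a rough sketch of what I have in mind.  It is a toy model in Python, not Xen or Linux code: every name in it (HypervisorMap, Dom0Handler, MachinePanic, the 4 KiB page size) is made up for illustration.  The hypervisor side only translates a machine address to a domain and a domain address; the Domain 0 side logs correctable errors in one place and panics on an uncorrectable one.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # assumed page size, for illustration only


@dataclass
class Owner:
    """Which domain owns a faulting address, and where it sits in that domain."""
    domain_id: int
    guest_addr: int


class HypervisorMap:
    """Hypothetical stand-in for the only job left to the hypervisor:
    mapping a machine physical address to (domain, domain address)."""

    def __init__(self):
        # machine frame number -> (domain id, guest frame number)
        self._frames = {}

    def assign(self, mfn, domain_id, gfn):
        self._frames[mfn] = (domain_id, gfn)

    def lookup(self, machine_addr) -> Optional[Owner]:
        entry = self._frames.get(machine_addr // PAGE_SIZE)
        if entry is None:
            return None  # frame not owned by any domain
        domain_id, gfn = entry
        return Owner(domain_id, gfn * PAGE_SIZE + machine_addr % PAGE_SIZE)


class MachinePanic(Exception):
    """Models Domain 0 taking the whole machine down."""


class Dom0Handler:
    """Hypothetical Domain 0 side: one log for all correctable errors,
    panic on anything uncorrectable."""

    def __init__(self, hv_map):
        self.hv_map = hv_map
        self.log = []  # single location for correctable errors

    def handle(self, machine_addr, correctable):
        owner = self.hv_map.lookup(machine_addr)
        if not correctable:
            raise MachinePanic(f"uncorrectable error at {machine_addr:#x}")
        self.log.append((machine_addr, owner))
        return owner
```

In this toy, a correctable error at machine address 0x10123 in a frame assigned to domain 3 just ends up in the Domain 0 log along with its owner, while the same call with correctable=False raises MachinePanic.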

Handling MCEs in domains other than Domain 0 seems to make sense only for
HVM, where you want to simulate memory errors.

(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sig...@m5p.com  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
