Jiaqi Yan <jiaqi...@google.com> writes:

> Correctable memory errors are very common on servers with large
> amount of memory, and are corrected by ECC, but with two
> pain points to users:
> 1. Correction usually happens on the fly and adds latency overhead
> 2. Not-fully-proved theory states excessive correctable memory
>    errors can develop into uncorrectable memory error.

This patchkit is amusing (or maybe sad) because it basically tries to
reconstruct the original soft offline design using a user space daemon
instead of doing policy badly in the kernel.

You can still have it by enabling CONFIG_X86_MCELOG_LEGACY and
using http://www.mcelog.org or an equivalent daemon of your choosing
that listens to /dev/mcelog.
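
To check whether a given machine is set up for this, something like the
following rough sketch may help. It only tests whether the option is set in
a kernel config text and whether the device node exists. The helper names
are made up for illustration, and the /boot/config-<release> location is an
assumption that holds on many distributions but not all.

```python
import os
import platform


def mcelog_legacy_enabled(config_text):
    """Return True if CONFIG_X86_MCELOG_LEGACY=y appears in the config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options show up as "# CONFIG_FOO is not set"; skip comments.
        if line.startswith("#"):
            continue
        if line == "CONFIG_X86_MCELOG_LEGACY=y":
            return True
    return False


def mcelog_device_present(path="/dev/mcelog"):
    """Return True if the legacy device node exists (hypothetical helper)."""
    return os.path.exists(path)


if __name__ == "__main__":
    # Assumed location of the running kernel's config; adjust for your distro.
    config_path = "/boot/config-" + platform.release()
    try:
        with open(config_path) as f:
            enabled = mcelog_legacy_enabled(f.read())
    except OSError:
        enabled = False
    print("CONFIG_X86_MCELOG_LEGACY set:", enabled)
    print("/dev/mcelog present:", mcelog_device_present())
```

If both come back true, a daemon such as mcelog should be able to open the
device and apply its own policy.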

-Andi

