* Sergio Gelato [2016-03-04 09:12:14 +0100]:
> I still think there is a kernel issue here: mcelog shouldn't be able to
> request the wrong page cache mode and spoil things for everyone else.

It turns out that mcelog, just like dmidecode, mmap()s portions of /dev/mem, which results in the pages being marked WB for the lifetime of the mapping, which can be short (dmidecode, mcelog --dmi) or long (mcelog --daemon). Any attempt to read from /sys/firmware/dmi/entries/*/raw while the pages are marked WB results in EINVAL. This is because dmi_remap() is an alias for ioremap(), and the latter is currently a wrapper around ioremap_nocache().

> (Or is it dmi_remap() that's asking for the wrong mode? I'm not quite sure:
> if DMI data are non-volatile and read-only (are they always?) why shouldn't
> they be cached?)

In other words: in arch/x86/include/asm/dmi.h (and perhaps in arch/ia64/include/asm/dmi.h), would it be safe to #define dmi_remap ioremap_cache instead of the current definition? If the answer is yes, that should solve the problem. Otherwise it's the mmap code that may need adjusting.

A workaround may be to teach mcelog (and dmidecode, while we're at it) to use the /sys/firmware/dmi interface when available.
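
To make the workaround concrete, here is a rough sketch of what the userspace side might look like. This is not taken from mcelog or dmidecode: read_all() and read_dmi_entry() are hypothetical helper names, and I am assuming the entries are named "<type>-<instance>" under /sys/firmware/dmi/entries/, with the raw attribute typically readable by root only. A NULL return would be the cue for the caller to fall back to its existing /dev/mem code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a malloc()ed buffer.
 * The buffer grows as data arrives, so the size reported by stat()
 * for a sysfs attribute is never relied upon.
 * Returns NULL on any error; on success *len holds the byte count. */
unsigned char *read_all(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    size_t cap = 4096, used = 0;
    unsigned char *buf = malloc(cap);
    if (!buf) {
        fclose(f);
        return NULL;
    }

    for (;;) {
        size_t n = fread(buf + used, 1, cap - used, f);
        used += n;
        if (n == 0)
            break;                      /* EOF or read error */
        if (used == cap) {              /* buffer full: double it */
            unsigned char *tmp = realloc(buf, cap * 2);
            if (!tmp) {
                free(buf);
                fclose(f);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (ferror(f)) {                    /* e.g. a failed read on the raw attribute */
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);
    *len = used;
    return buf;
}

/* Hypothetical wrapper: fetch one DMI entry through sysfs instead of
 * mmap()ing /dev/mem. The "<type>-<instance>" naming is an assumption
 * about the sysfs layout; NULL means "not available, fall back". */
unsigned char *read_dmi_entry(int type, int instance, size_t *len)
{
    char path[64];
    snprintf(path, sizeof path,
             "/sys/firmware/dmi/entries/%d-%d/raw", type, instance);
    return read_all(path, len);
}
```

A caller would try something like read_dmi_entry(0, 0, &len) first and only touch /dev/mem when that returns NULL, so that no WB mapping of the DMI pages is created on kernels that export the sysfs interface.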