When we use "/sys/devices/system/memory/soft_offline_page" to offline a *free* page, the value of mce_bad_pages is incremented and the HWPoison flag is set on the page, but the page is still managed by the buddy allocator.

The value is shown by:

$ cat /proc/meminfo | grep HardwareCorrupted

If we offline the same page again, the value of mce_bad_pages is incremented *again*, which means the value is now incorrect. This assumes the page is still free during this short time.

soft_offline_page()
	get_any_page()
		"else if (is_free_buddy_page(p))" branch returns 0
			"goto done";
				"atomic_long_add(1, &mce_bad_pages);"

Changelog:
V3:
	-add page lock when setting the HWPoison flag
	-adjust the function structure
V2 and V1:
	-fix the error

Xishi Qiu (2):
  move poisoned page check at the beginning of the function
  fix the function structure

 mm/memory-failure.c |   69 ++++++++++++++++++++++++++++----------------------
 1 files changed, 38 insertions(+), 31 deletions(-)
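
For illustration only, the double counting described above can be modelled in userspace. This is a minimal sketch, not kernel code: struct model_page, model_bad_pages, model_offline_unguarded() and model_offline_guarded() are hypothetical names invented here, and returning -EBUSY for an already poisoned page is an assumption of the model. It only shows why checking the poisoned flag before the rest of the function keeps the counter correct.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct model_page {
	bool hwpoison;	/* models the HWPoison page flag */
	bool is_free;	/* models is_free_buddy_page() being true */
};

/* Hypothetical stand-in for the mce_bad_pages counter. */
static long model_bad_pages;

/*
 * Models the behaviour described in this cover letter: a free page
 * takes the "return 0" branch, reaches "done", and is counted,
 * whether or not it was already poisoned.
 */
static int model_offline_unguarded(struct model_page *p)
{
	if (p->is_free) {
		p->hwpoison = true;
		model_bad_pages++;
	}
	return 0;
}

/*
 * Models the idea of moving the poisoned page check to the beginning
 * of the function: an already poisoned page is rejected before the
 * counter is touched. The -EBUSY return value is an assumption.
 */
static int model_offline_guarded(struct model_page *p)
{
	if (p->hwpoison)
		return -EBUSY;

	if (p->is_free) {
		p->hwpoison = true;
		model_bad_pages++;
	}
	return 0;
}
```

In this model, calling model_offline_unguarded() twice on the same free page leaves model_bad_pages at 2, while calling model_offline_guarded() twice leaves it at 1 and the second call fails.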