Changes from previous version:
* Replaced the late, action-time recheck for MF_MSG_KERNEL_HIGH_ORDER with
  early failure classification inside get_any_page() (new patch 2/4).
* David Hildenbrand pointed out that the recheck inspected refcount and
  folio mapping without holding a folio reference, which is unsafe
  (concurrent split can trigger VM_WARN_ON_FOLIO).
* Lance Yang suggested moving the disambiguation to the call site that
  still knows *why* the page reference could not be taken, which is what
  this version does via a new enum mf_get_page_status (MF_GET_PAGE_OK /
  RACE / UNHANDLABLE) plumbed out through get_hwpoison_page().

Signed-off-by: Breno Leitao <[email protected]>
---
Changes in v6:
- Dropped the selftest given the value was not clear
- Get the status of the failure from get_any_page()
- Small nits from different people/AIs.
- Link to v5: https://patch.msgid.link/[email protected]

Changes in v5:
- Add vm.panic_on_unrecoverable_memory_failure sysctl to panic on
  unrecoverable kernel page hwpoison events (reserved pages, refcount-0
  non-buddy pages, unknown state), with a recheck to avoid racing with
  concurrent buddy allocations. (Miaohe)
- Distinguish reserved pages as MF_MSG_KERNEL in memory_failure(),
  document the new sysctl in Documentation/admin-guide/sysctl/vm.rst, and
  add a selftest verifying SIGBUS recovery on userspace pages still works
  when the sysctl is enabled. (Miaohe)
- Added a selftest
- Link to v4: https://patch.msgid.link/[email protected]

Changes in v4:
- Drop CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC kernel configuration option.
- Split the reserved page classification (MF_MSG_KERNEL) into its own
  patch, separate from the panic mechanism.
- Document why the buddy allocator TOCTOU race (between
  get_hwpoison_page() and is_free_buddy_page()) cannot cause false
  positives: PG_hwpoison is set beforehand and check_new_page() in the
  page allocator rejects hwpoisoned pages.
- Document the narrow LRU isolation race window for MF_MSG_UNKNOWN and its
  mitigation via identify_page_state()'s two-pass design.
- Explicitly document why MF_MSG_GET_HWPOISON is excluded from the panic
  conditions (shared path with transient races and non-reserved kernel
  memory).
- Link to v3: https://patch.msgid.link/[email protected]

Changes in v3:
- Rename is_unrecoverable_memory_failure() to panic_on_unrecoverable_mf()
  as suggested by maintainer.
- Add CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC kernel configuration option,
  similar to CONFIG_BOOTPARAM_HARDLOCKUP_PANIC.
- Add documentation for the sysctl and CONFIG option.
- Add code comments documenting the panic condition design rationale and
  how the retry mechanism mitigates false positives from buddy allocator
  races.
- Link to v2: https://patch.msgid.link/[email protected]

Changes in v2:
- Panic on MF_MSG_KERNEL, MF_MSG_KERNEL_HIGH_ORDER and MF_MSG_UNKNOWN
  instead of MF_MSG_GET_HWPOISON.
- Report MF_MSG_KERNEL for reserved pages when get_hwpoison_page() fails
  instead of MF_MSG_GET_HWPOISON.
- Link to v1: https://patch.msgid.link/[email protected]

To: Miaohe Lin <[email protected]>
To: Naoya Horiguchi <[email protected]>
To: Andrew Morton <[email protected]>
To: Steven Rostedt <[email protected]>
To: Masami Hiramatsu <[email protected]>
To: Mathieu Desnoyers <[email protected]>
To: Jonathan Corbet <[email protected]>
To: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

---
Breno Leitao (4):
      mm/memory-failure: report MF_MSG_KERNEL for reserved pages
      mm/memory-failure: classify get_any_page() failures by reason
      mm/memory-failure: add panic option for unrecoverable pages
      Documentation: document panic_on_unrecoverable_memory_failure sysctl

 Documentation/admin-guide/sysctl/vm.rst | 70 ++++++++++++++++++++++++++
 include/trace/events/memory-failure.h   |  2 +-
 mm/memory-failure.c                     | 89 ++++++++++++++++++++++++++++++---
 3 files changed, 152 insertions(+), 9 deletions(-)
---
base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
change-id: 20260323-ecc_panic-4e473b83087c

Best regards,
-- 
Breno Leitao <[email protected]>
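
For readers skimming the changelog, the idea behind patch 2/4 (classify the
failure where the reason is still known, then let the panic decision consume
that status) can be illustrated with a small userspace model. This is only a
sketch under assumptions, not the code in mm/memory-failure.c: the struct
mock_page type, the mock_* helpers, the full enumerator names
MF_GET_PAGE_RACE and MF_GET_PAGE_UNHANDLABLE, and the exact conditions
checked are all hypothetical and chosen for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the status enum named in the cover
 * letter. The enumerator spellings beyond MF_GET_PAGE_OK are assumed.
 */
enum mf_get_page_status {
	MF_GET_PAGE_OK,          /* a page reference was taken */
	MF_GET_PAGE_RACE,        /* transient failure; state changed under us */
	MF_GET_PAGE_UNHANDLABLE, /* page is in a state we cannot recover */
};

/* Hypothetical stand-in for the page state; not struct page. */
struct mock_page {
	int refcount;  /* 0 means nobody holds the page */
	bool in_buddy; /* page is free in the buddy allocator */
};

/*
 * Classify at the point where the reason is still known, instead of
 * re-inspecting the page later without holding a reference.
 */
static enum mf_get_page_status mock_classify(const struct mock_page *p)
{
	if (p->refcount > 0)
		return MF_GET_PAGE_OK;
	if (p->in_buddy)
		return MF_GET_PAGE_RACE; /* assumed: page was freed concurrently */
	return MF_GET_PAGE_UNHANDLABLE; /* refcount-0, non-buddy page */
}

/*
 * Assumed policy: panic only when the (modelled) sysctl is enabled and
 * the failure is not a transient race.
 */
static bool mock_should_panic(bool sysctl_enabled,
			      enum mf_get_page_status status)
{
	return sysctl_enabled && status == MF_GET_PAGE_UNHANDLABLE;
}
```

In this model a refcount-0 page that is not in the buddy allocator is the
only case that can lead to a panic, and only when the sysctl is enabled; a
page that merely lost a race is reported as such and never panics.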
