On Tue, May 12, 2026 at 10:17:00AM +0200, David Hildenbrand (Arm) wrote:
>On 5/11/26 17:38, Breno Leitao wrote:
>> When get_hwpoison_page() returns a negative value, distinguish
>> reserved pages from other failure cases by reporting MF_MSG_KERNEL
>> instead of MF_MSG_GET_HWPOISON. Reserved pages belong to the kernel
>> and should be classified accordingly for proper handling.
>>
>> Sample PG_reserved before the get_hwpoison_page() call. In the
>> MF_COUNT_INCREASED path get_any_page() can drop the caller's
>> reference before returning -EIO, after which the underlying page may
>> have been freed and reallocated with page->flags reset; reading
>> PageReserved(p) at that point would observe stale or unrelated state.
>> The pre-call snapshot reflects what the page actually was at the
>> time of the failure event.
>>
>> Acked-by: Miaohe Lin <[email protected]>
>> Reviewed-by: Lance Yang <[email protected]>
>> Signed-off-by: Breno Leitao <[email protected]>
>> ---
>> mm/memory-failure.c | 19 ++++++++++++++++++-
>> 1 file changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 866c4428ac7ef..f112fb27a8ff6 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -2348,6 +2348,7 @@ int memory_failure(unsigned long pfn, int flags)
>> unsigned long page_flags;
>> bool retry = true;
>> int hugetlb = 0;
>> + bool is_reserved;
>>
>> if (!sysctl_memory_failure_recovery)
>> panic("Memory failure on page %lx", pfn);
>> @@ -2411,6 +2412,18 @@ int memory_failure(unsigned long pfn, int flags)
>> * In fact it's dangerous to directly bump up page count from 0,
>> * that may make page_ref_freeze()/page_ref_unfreeze() mismatch.
>> */
>> + /*
>> + * Pages with PG_reserved set are not currently managed by the
>> + * page allocator (memblock-reserved memory, driver reservations,
>> + * etc.), so classify them as kernel-owned for reporting.
>> + *
>> + * Sample the flag before get_hwpoison_page(): in the
>> + * MF_COUNT_INCREASED path, get_any_page() can drop the caller's
>> + * reference before returning -EIO, after which page->flags may
>> + * have been reset by the allocator.
>> + */
>> + is_reserved = PageReserved(p);
>> +
>> res = get_hwpoison_page(p, flags);
>> if (!res) {
>> if (is_free_buddy_page(p)) {
>> @@ -2432,7 +2445,11 @@ int memory_failure(unsigned long pfn, int flags)
>> }
>> goto unlock_mutex;
>> } else if (res < 0) {
>> - res = action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
>> + if (is_reserved)
>> + res = action_result(pfn, MF_MSG_KERNEL, MF_IGNORED);
>> + else
>> + res = action_result(pfn, MF_MSG_GET_HWPOISON,
>> + MF_IGNORED);
>> goto unlock_mutex;
>> }
>>
>>
>
>It's a bit odd that we need this handling when we already have handling for
>reserved pages in error_states[].
>
>HWPoisonHandlable() would always essentially reject PG_reserved pages. So
>__get_hwpoison_page() ... would always fail? Making
>get_hwpoison_page()->get_any_page() always fail?
>
>But then, we never call identify_page_state()? And never call me_kernel()?
Looks like we never get that far ...
>This all looks very odd.
>
>Why would you even want to call get_hwpoison_page() in the first place if you
>find PageReserved?
Ah, I see :)
For a PG_reserved page, I would not expect PageLRU to be set, nor would
I expect it to be in the buddy allocator.
include/linux/page-flags.h also says:
"
Once (if ever) freed, PG_reserved is cleared and they will be given to
the page allocator.
"
So maybe special-case PageReserved() before get_hwpoison_page()?
Something like:
if (PageReserved(p)){
res = action_result(pfn, MF_MSG_KERNEL, MF_IGNORED);
goto unlock_mutex;
}
res = get_hwpoison_page(p, flags, &gp_status);
Cheers, Lance