On 2020-09-17 17:27, HORIGUCHI NAOYA wrote:
Sorry, I modified the patches based on a different assumption from
yours.
I first thought of taking the page off after confirming that the error
page has been freed back to buddy. This approach leaves the possibility
of the error page being reused (which is acceptable), but it is the
simpler and less invasive one.
Your approach removes the error page from the page allocator's control
at freeing time. It leaves no possibility of the error page being
reused, but the changes are tightly coupled with the page free code.
This is a tradeoff between complexity and completeness of soft offline.
Now I'm not sure I can insist on my own opinion without providing
working code, so it's OK for me to take yours.
Yeah, you are right, it is a tradeoff.
I would suggest taking this path now, and if it proves to be problematic
in some way, we can always do the:

  free_page
  take_it_off_buddy
    OK: mark it as hwpoison and increment refcount
    NOT_OK (raced with allocation): oops, sorry
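
For illustration only, the fallback sequence above could be modelled in
userspace roughly as follows. This is a toy sketch, not kernel code:
struct toy_page and every toy_* function are hypothetical names made up
for this example, and the race with the allocator is simulated by a
plain flag rather than by real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page; it does not mirror the kernel's struct page. */
struct toy_page {
	bool in_buddy;	/* page currently sits on the free list */
	bool hwpoison;	/* page has been marked as poisoned */
	int refcount;	/* references held on the page */
};

/* Hypothetical: hand the page back to the (toy) allocator. */
void toy_free_page(struct toy_page *p)
{
	assert(p != NULL);
	p->refcount = 0;
	p->in_buddy = true;
}

/* Hypothetical: someone else allocates the page before we can grab it. */
void toy_racing_alloc(struct toy_page *p)
{
	if (p->in_buddy) {
		p->in_buddy = false;
		p->refcount = 1;
	}
}

/* Hypothetical: pull the page off the free list; fails if it is gone. */
bool toy_take_off_buddy(struct toy_page *p)
{
	if (!p->in_buddy)
		return false;
	p->in_buddy = false;
	return true;
}

/*
 * free_page, then take_it_off_buddy.
 * Returns true for the OK case, false for NOT_OK (raced with allocation).
 */
bool toy_free_then_take(struct toy_page *p, bool simulate_race)
{
	toy_free_page(p);
	if (simulate_race)
		toy_racing_alloc(p);

	if (!toy_take_off_buddy(p))
		return false;	/* NOT_OK: page was reallocated under us */

	p->hwpoison = true;	/* OK: mark it as hwpoison ... */
	p->refcount++;		/* ... and pin it with a reference */
	return true;
}
```

In the NOT_OK case the page is left untouched in the hands of its new
owner, which is the "possibility of reusing the error page" that this
simpler scheme accepts.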
The test passed in my environment, so this is fine.
Thanks for trying it out.
If they do, I will try to see if Andrew can squeeze the above changes
into [1], where they belong.
Yes, proposing the fix for
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
seems fine to me.
Again, sorry for modifying code without asking.
No worries, I will do a couple of tests on my own and then I will talk
to Andrew to see if we can squeeze the changes in there.