On 22.08.24 10:24, Michal Hocko wrote:
On Thu 22-08-24 19:57:41, Barry Song wrote:
Regarding the concern about 'leaving locks
behind' you have in that subthread,  I believe there's no difference
when returning NULL, as it could still leave locks behind but offers
a chance for the calling process to avoid an immediate crash.

Yes, I have mentioned this risk just for completeness. Without having
some sort of unwinding mechanism we are doomed to not be able to handle
this.

The sole difference between just returning NULL and OOPSing right away
is that with the former an oops is not guaranteed to happen, and the
caller can cause actual harm by dereferencing non-oopsing addresses close
to 0, which a) would be much harder to track down and b) could cause much
more damage than killing the context right away.
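
To illustrate the point about non-oopsing addresses: a minimal userspace sketch, assuming a hypothetical 4096-byte unmapped region at address 0 (the real size depends on architecture and configuration) and a hypothetical struct big_obj. A field near the start of the object faults when accessed through a NULL base; a field past the guard region may not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: size of the unmapped low region. Arch/config dependent. */
#define GUARD_BYTES 4096UL

/* Hypothetical object: small header, large payload, trailing field. */
struct big_obj {
	int  flags;
	char payload[2 * 4096];
	long tail;
};

/*
 * True if an access at 'offset' from a NULL base lands inside the
 * unmapped guard region and would therefore fault.
 */
static bool null_access_faults(size_t offset)
{
	return offset < GUARD_BYTES;
}
```

Under these assumptions, null_access_faults(offsetof(struct big_obj, flags)) is true, while the same check for the tail field is false: that access would silently touch whatever happens to be mapped there.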

Besides that I believe we have many BUG_ON users which would really
prefer to just kill the current context instead, they just do not have
the means to do that, so OOPS_ON could be a safer way to stop bad users
and reduce the number of BUG_ONs as well.

To me that sounds better as well, but I was also wondering if it's easy to implement or easy to assemble from existing pieces.


Linus has a point that "retry forever" can also be nasty. I think the important part here is, though, that we report sufficient information (stacktrace), such that the problem can be debugged reasonably well, rather than just ending up with a locked-up system.

But then the question is: does it really make sense to differentiate between a NOFAIL allocation of MAX_ORDER under memory pressure and one of MAX_ORDER+1 (Linus also touched on that)? It could well take minutes/hours/days to satisfy a very large NOFAIL allocation. So callers should be prepared to run into effective lockups ... :/

NOFAIL shouldn't exist, or at least not be used to that degree.

I am to blame myself, I made use of it in kernel/resource.c, where there is no turning back once memory unplug is 99% complete (even having freed the vmemmap), but then we might have to allocate a new node in the resource tree when we have to split an existing one. Maybe there would be ways to preallocate before starting memory unplug, or to pre-split ...

But then again, sizeof(struct resource) is probably so small that it likely would never fail.
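
The preallocation idea could be sketched roughly as follows. This is a userspace illustration only, not the kernel/resource.c code: struct node and remove_range() are hypothetical names, malloc() stands in for the kernel allocator, and the removed range is assumed to overlap the node without covering all of it.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical range node; ranges are inclusive. */
struct node {
	unsigned long start, end;
	struct node *next;
};

/* Returns 0 on success, -1 if preallocation failed (nothing modified). */
static int remove_range(struct node *n, unsigned long start, unsigned long end)
{
	/* A split is needed only if the hole lies strictly inside the node. */
	bool need_split = start > n->start && end < n->end;
	struct node *spare = NULL;

	if (need_split) {
		/* Allocate while backing out is still possible. */
		spare = malloc(sizeof(*spare));
		if (!spare)
			return -1;
	}

	/* Point of no return: nothing below can fail. */
	if (need_split) {
		spare->start = end + 1;
		spare->end = n->end;
		spare->next = n->next;
		n->end = start - 1;
		n->next = spare;
	} else if (start <= n->start) {
		n->start = end + 1;	/* trim the front */
	} else {
		n->end = start - 1;	/* trim the tail */
	}
	return 0;
}
```

The allocation that might fail happens before any irreversible step, so the failure path can simply return an error instead of needing a NOFAIL allocation.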

--
Cheers,

David / dhildenb
