On Sat, Feb 29, 2020 at 9:49 AM Dale <rdalek1...@gmail.com> wrote:
>
> I have noticed the OOM killing the wrong thing as well.  In a way, how
> does it know what it should kill really???  After all, the process using
> the most memory may not be the problem but another one, or more, could.
> I guess in most cases the one using the most is the bad one but it may
> not always be the case.  I'm not sure how OOM could determine that tho.
> Maybe having some setting like you mention would help.  It's a thought.

Oh, plenty of people have given thought to it.

The algorithm is actually not as well-documented as I thought it was.
There are lots of documents, but they're fragmented.  Behavior is also
configurable.  For example, you can tell Linux to panic on OOM
(vm.panic_on_oom), or have it kill the process that triggered the OOM
even if it isn't using much memory (vm.oom_kill_allocating_task).
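
If you want to see how a box is currently configured, here is a rough
Python sketch that reads those two sysctls.  The function name and the
proc_root parameter are my own inventions for illustration; the only
thing it relies on is the files under /proc/sys/vm.

```python
from pathlib import Path

# Sysctls that change what the kernel does on OOM (both live in /proc/sys/vm).
OOM_SYSCTLS = ("panic_on_oom", "oom_kill_allocating_task")


def read_oom_sysctls(proc_root="/proc"):
    """Return {name: int or None}; None means the file could not be read."""
    settings = {}
    for name in OOM_SYSCTLS:
        path = Path(proc_root) / "sys" / "vm" / name
        try:
            settings[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Missing file, no permission, or unexpected content.
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_oom_sysctls().items():
        print(f"vm.{name} = {value}")
```

A value of 0 for both means neither override is enabled.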

The best docs I could find (among other places) are at:
https://github.com/torvalds/linux/blob/master/Documentation/filesystems/proc.txt#L1520

Aside from setting limits on services so that they die/restart before
overall system memory is threatened, adjusting
/proc/<pid>/oom_score_adj lets you make any given process more or less
likely to be picked.
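
As a minimal sketch of what that adjustment looks like from Python
(the helper names and the proc_root parameter are made up for this
example; the file itself and its -1000..1000 range come from the
kernel docs):

```python
from pathlib import Path

OOM_SCORE_ADJ_MIN = -1000  # exempts the process from the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # makes the process a very likely victim


def _adj_path(pid, proc_root="/proc"):
    return Path(proc_root) / str(pid) / "oom_score_adj"


def get_oom_score_adj(pid, proc_root="/proc"):
    """Read the current adjustment for a process."""
    return int(_adj_path(pid, proc_root).read_text().strip())


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write a new adjustment, rejecting values the kernel would refuse."""
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj must be in [-1000, 1000], got {value}")
    _adj_path(pid, proc_root).write_text(f"{value}\n")
```

Something like set_oom_score_adj(os.getpid(), 500) would volunteer the
current process as an early victim.  As far as I know, raising the
value works for your own processes, while lowering it generally needs
extra privileges (CAP_SYS_RESOURCE).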

By default it mostly comes down to which process is hogging the most
RAM, with a slight preference for sparing root-owned processes.

Really, though, setting resource limits is your best bet.  Then you can
set a threshold above normal use, and if something misbehaves it is
going to get restarted before most of RAM is taken.  User session
cgroups can of course be limited as well, so that interactive processes
can't just go nuts and use all your RAM.
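
To make the limit idea concrete, here is a sketch that assumes cgroup
v2 (the unified hierarchy, usually mounted at /sys/fs/cgroup) and sets
the hard limit by writing memory.max directly.  The helper names and
the example cgroup path are hypothetical.  If systemd manages your
services you would normally set MemoryMax= on the unit instead of
poking the files yourself, and restarting the service after it gets
killed is the service manager's job, not something this code does.

```python
from pathlib import Path


def set_memory_max(cgroup_dir, limit_bytes=None):
    """Set the cgroup v2 hard memory limit; None writes "max" (no limit)."""
    if limit_bytes is not None and limit_bytes <= 0:
        raise ValueError("limit_bytes must be a positive integer or None")
    value = "max" if limit_bytes is None else str(limit_bytes)
    (Path(cgroup_dir) / "memory.max").write_text(value + "\n")


def get_memory_max(cgroup_dir):
    """Return the limit in bytes, or None if the cgroup is unlimited."""
    text = (Path(cgroup_dir) / "memory.max").read_text().strip()
    return None if text == "max" else int(text)
```

For example, set_memory_max("/sys/fs/cgroup/myservice", 2 * 1024**3)
would cap a (hypothetical) "myservice" group at 2 GiB, so a runaway
process in it hits its own limit long before the rest of the system
is in trouble.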

-- 
Rich
