On Sat, Jan 04, 2020 at 04:38:19PM -0700, Chris Murphy wrote:
> On Sat, Jan 4, 2020 at 2:51 AM Aleksandra Fedorova <al...@bookwar.info> wrote:
> 
> > Since the Change does not just introduce the earlyoom tool but also
> > enables it with a specific profile, I would add those details here. Something like:
> >
> > "The earlyoom service will choose the offending process based on the same
> > oom_score the kernel uses. It will send SIGTERM when 10% of RAM is left,
> > and SIGKILL at 5%."
> 
> I've added this information to the summary. Also, I think these numbers may
> need to change to avoid prematurely sending SIGTERM when the system
> has no swap device.
> 
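
The thresholds under discussion can be sketched as a small decision function. This is an illustration only, not earlyoom's actual code: the name `choose_signal`, its parameters, and the rule that both available memory and free swap must be at or below a limit before a signal is sent are assumptions (the rule is inferred from the "mem 10 %, swap 10 %" journal message quoted further down). A system without swap is modeled here as 0% free swap, so only the memory threshold matters in that case.

```python
from typing import Optional


def choose_signal(mem_avail_pct: float, swap_free_pct: float,
                  term_pct: float = 10.0, kill_pct: float = 5.0) -> Optional[str]:
    """Return the name of the signal to send, or None if memory is fine.

    Hypothetical helper: the 10% / 5% defaults come from the proposal text,
    the "both values must be low" rule is an assumption.
    """
    # Check the stricter limit first so SIGKILL wins over SIGTERM.
    if mem_avail_pct <= kill_pct and swap_free_pct <= kill_pct:
        return "SIGKILL"
    if mem_avail_pct <= term_pct and swap_free_pct <= term_pct:
        return "SIGTERM"
    return None


# No swap device: swap_free_pct is 0, so 8% available RAM already triggers SIGTERM.
print(choose_signal(8.0, 0.0))
# Plenty of free swap: the same 8% available RAM does not trigger anything.
print(choose_signal(8.0, 50.0))
```

Under these assumptions a swapless system reaches the SIGTERM condition on the memory figure alone, which is the case the reply above worries about.
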
> > As I understand it, in the current setup we are looking for a controlled
> > failure scenario rather than a solution.
> 
> Yes, it's fair to say this proposal is to make things "less bad". It
> doesn't improve system responsiveness. Once heavy swap starts, the
> system is sluggish, stutters, and briefly stalls. This proposal
> doesn't fix that. There is a lot of room for improvement.
> 
> 
> > Can we get a specific manual on what users are supposed to do once they
> > trigger earlyoom? Does earlyoom help with reporting? Which logs do we need
> > to look at?
> >
> > Maybe add a section in the UX part of the Change, or set up a dedicated wiki
> > page?
> 
> The user shouldn't need to do anything differently than if the kernel
> oom-killer had triggered. The system journal will contain messages
> showing what was killed and why:
> 
> Jan 04 16:05:42 fmac.local earlyoom[4896]: low memory! at or below SIGTERM limits: mem 10 %, swap 10 %
> Jan 04 16:05:42 fmac.local earlyoom[4896]: sending SIGTERM to process 27421 "chrome": badness 305, VmRSS 42 MiB
> 
> 
> > Additionally, there was a question during the chat discussion: how will the
> > earlyoom setup work together with OOMPolicy and any other related
> > options of systemd units? Will systemd recognize the OOM event?
> 
> My understanding of systemd's OOMPolicy= behavior is that it looks for the
> kernel's oom-killer messages and acts upon those. Although earlyoom
> uses the same metric (oom_score) as the oom-killer, it does not invoke
> the oom-killer. Therefore systemd probably does not get the proper
> hint to implement OOMPolicy=.

Yes. The kernel reports oom events in the cgroup file memory.events,
and systemd waits for an inotify event on that file; OOMPolicy=stop is
implemented that way. And the OOMPolicy=kill option is "implemented"
by setting memory.oom.group=1 in the kernel [1] and having the kernel
kill all the processes. So systemd is providing a thin wrapper around
the kernel functionality.

If processes are not killed by the kernel but through a signal from
userspace, none of this will work.
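
To illustrate, here is a minimal sketch of reading the counters that this mechanism relies on. The cgroup v2 file memory.events holds one "name count" pair per line, including `oom` and `oom_kill`. The helper names and the sample contents below are hypothetical, and a real monitor would wait for file-modified notifications instead of comparing two snapshots.

```python
from typing import Dict


def parse_memory_events(text: str) -> Dict[str, int]:
    """Parse the flat-keyed contents of a cgroup v2 memory.events file."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            counters[parts[0]] = int(parts[1])
    return counters


def kernel_oom_kills(before: str, after: str) -> int:
    """Number of kernel OOM kills recorded between two snapshots."""
    old = parse_memory_events(before).get("oom_kill", 0)
    new = parse_memory_events(after).get("oom_kill", 0)
    return new - old


# Hypothetical snapshots of one unit's memory.events file.
snapshot_before = "low 0\nhigh 0\nmax 3\noom 0\noom_kill 0\n"
snapshot_after_kernel_kill = "low 0\nhigh 0\nmax 5\noom 1\noom_kill 1\n"

print(kernel_oom_kills(snapshot_before, snapshot_after_kernel_kill))
# A process ended by a userspace SIGTERM leaves the file unchanged.
print(kernel_oom_kills(snapshot_before, snapshot_before))
```

A watcher built this way sees nothing when earlyoom terminates a process, because the kernel never increments `oom_kill` for a signal sent from userspace.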

[1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory-interface-files

Zbyszek

> Fedora needs to discuss how big a problem that is, and whether there's any
> way to mitigate or tolerate it, weighing the pros of earlyoom for a
> short period against the cons of punting this problem for another
> release. This proposal does not intend to step on other superseding
> work in this area, but if it does, it'll be withdrawn.
