On Sat, Jan 04, 2020 at 04:38:19PM -0700, Chris Murphy wrote:
> On Sat, Jan 4, 2020 at 2:51 AM Aleksandra Fedorova <al...@bookwar.info> wrote:
>
> > Since in the Change we are not introducing just the earlyoom tool but
> > enable it with a specific profile I would add those details here. Something like:
> >
> > "earlyoom service will choose the offending process based on the same
> > oom_score as kernel uses. It will send a SIGTERM signal on 10% of RAM left,
> > and SIGKILL on 5%"
>
> I add this information to the summary. Also, I think these numbers may
> need to change to avoid prematurely sending SIGTERM when the system
> has no swap device.
>
> > As I understand in the current setup we are looking more for a controlled
> > failure scenario rather than for a solution.
>
> Yes, it's fair to say this proposal is to make things "less bad". It
> doesn't improve system responsiveness. Once heavy swap starts, the
> system is sluggish, stutters, and briefly stalls. This proposal
> doesn't fix that. There is a lot of room for improvement.
>
>
> > Can we get a specific manual, what users are supposed to do, once they trigger
> > the earlyoom? Does earlyoom help in reporting? Which logs we need to look
> > at?
> >
> > Maybe add a section in UX part of the change, or set up a dedicated wiki
> > page?
>
> The user shouldn't need to do anything differently than if the kernel
> oom-killer had triggered. The system journal will contain messages
> showing what was killed and why:
>
> Jan 04 16:05:42 fmac.local earlyoom[4896]: low memory! at or below
> SIGTERM limits: mem 10 %, swap 10 %
> Jan 04 16:05:42 fmac.local earlyoom[4896]: sending SIGTERM to process
> 27421 "chrome": badness 305, VmRSS 42 MiB
>
>
> > Additionally, there was a question during the chat discussion: how the
> > earlyoom setup will work together with OOMPolicy and any other related
> > options of systemd units? Will systemd recognize the OOM event?
>
> My understanding of systemd's OOMPolicy= behavior is that it looks for the
> kernel's oom-killer messages and acts upon those. Whereas earlyoom
> uses the same metric (oom_score) as the oom-killer, it does not invoke
> the oom-killer. Therefore systemd probably does not get the proper
> hint to implement OOMPolicy=

Yes. The kernel reports oom events in the cgroup file memory.events,
and systemd waits for an inotify event on that file; OOMPolicy=stop is
implemented that way. And the OOMPolicy=kill option is "implemented" by
setting memory.oom.group=1 in the kernel [1] and having the kernel kill
all the processes. So systemd is providing a thin wrapper around the
kernel functionality. If processes are not killed by the kernel but
through a signal from userspace, all of this will not work.

[1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory-interface-files

Zbyszek

> Fedora needs to discuss how big of a problem that is, if there's any way
> to mitigate it, or tolerate it, weighing the pros of earlyoom for a
> short period, versus the cons of punting this problem for another
> release. This proposal does not intend to step on other superseding
> work in this area, but if it does, it'll be withdrawn.
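
To make the memory.events point in the reply above concrete, here is a rough Python sketch of how a watcher could tell a kernel OOM kill apart from a userspace kill. It is not systemd's code (systemd is written in C and waits on inotify rather than comparing snapshots); the function names and the sample file contents are made up for illustration. The only thing taken from the kernel documentation is that memory.events is a flat "key value" file with an oom_kill counter.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse the flat 'key value' lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            events[parts[0]] = int(parts[1])
    return events


def kernel_oom_kills(before, after):
    """Number of kernel OOM kills recorded between two snapshots."""
    return after.get("oom_kill", 0) - before.get("oom_kill", 0)


def read_memory_events(cgroup_dir):
    """Read memory.events from a cgroup directory (path is caller-supplied)."""
    return parse_memory_events((Path(cgroup_dir) / "memory.events").read_text())


# Hypothetical snapshots, not captured from a real system.
SAMPLE_BEFORE = "low 0\nhigh 0\nmax 0\noom 0\noom_kill 0\n"
SAMPLE_AFTER_KERNEL_KILL = "low 0\nhigh 0\nmax 3\noom 1\noom_kill 1\n"
# A SIGTERM/SIGKILL sent from userspace does not touch these counters,
# so the file looks the same as before.
SAMPLE_AFTER_USERSPACE_KILL = SAMPLE_BEFORE

before = parse_memory_events(SAMPLE_BEFORE)
kernel_delta = kernel_oom_kills(before, parse_memory_events(SAMPLE_AFTER_KERNEL_KILL))
userspace_delta = kernel_oom_kills(before, parse_memory_events(SAMPLE_AFTER_USERSPACE_KILL))
```

With these samples, kernel_delta is 1 and userspace_delta is 0. That is the gap described above: a process killed by a signal from earlyoom leaves oom_kill unchanged, so anything keyed on that counter never fires.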