> On Mar 13, 2023, at 9:28 AM, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>
> On 3/13/23 10:21 AM, Israel Brewster wrote:
>> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit
>> more memory constrained than I would like, such that every week or so the
>> various processes running on the machine will align badly and the OOM killer
>> will kick in, killing off postgresql, as per the following journalctl output:
>>
>> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process of this unit has been killed by the OOM killer.
>> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with result 'oom-kill'.
>> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d 17h 48min 24.509s CPU time.
>>
>> And the service is no longer running.
>>
>> When this happens, I go in and restart the postgresql service, and
>> everything is happy again for the next week or two.
>>
>> Obviously this is not a good situation. Which leads to two questions:
>>
>> 1) is there some tweaking I can do in the postgresql config itself to
>> prevent the situation from occurring in the first place?
>> 2) My first thought was to simply have systemd restart postgresql whenever
>> it is killed like this, which is easy enough. Then I looked at the default
>> unit file, and found these lines:
>>
>> # prevent OOM killer from choosing the postmaster (individual backends will
>> # reset the score to 0)
>> OOMScoreAdjust=-900
>> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
>> # so we disable it here. Also, the postmaster will restart by itself on most
>> # problems anyway, so it is questionable if one wants to enable external
>> # automatic restarts.
>> #Restart=on-failure
>>
>> Which seems to imply that the OOM killer should only be killing off
>> individual backends, not the entire cluster to begin with - which should be
>> fine. And also that adding the restart=on-failure option is probably not the
>> greatest idea. Which makes me wonder what is really going on?
>
> You might want to read:
>
> https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

Good information, thanks. One thing there confuses me though. It says:

  Another approach, which can be used with or without altering
  vm.overcommit_memory, is to set the process-specific OOM score adjustment
  value for the postmaster process to -1000, thereby guaranteeing it will not
  be targeted by the OOM killer.

Isn’t that exactly what the "OOMScoreAdjust=-900" line in the unit file does, though (except with a score of -900 rather than -1000)?

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145

>
>> Thanks.
>> ---
>> Israel Brewster
>> Software Engineer
>> Alaska Volcano Observatory
>> Geophysical Institute - UAF
>> 2156 Koyukuk Drive
>> Fairbanks AK 99775-7320
>> Work: 907-474-5172
>> cell: 907-328-9145
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
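
One way to see which adjustment is actually in effect is to read it back from the kernel. What follows is only a minimal sketch, not something taken from the PostgreSQL docs or the Debian packaging: the function names are made up for illustration, and the PID has to be supplied by hand (for example, the first line of the cluster's postmaster.pid). It assumes a Linux /proc filesystem, where each process exposes its value in /proc/<pid>/oom_score_adj, and it relies on the kernel's documented behavior that only -1000 disables OOM killing outright, while other negative values merely make the process a less likely target.

```python
import sys
from pathlib import Path


def read_oom_score_adj(pid, proc_root="/proc"):
    """Return the oom_score_adj value of the given process as an int.

    proc_root is a parameter only so the function can be pointed at a
    fake directory tree; on a real system it should stay "/proc".
    """
    path = Path(proc_root, str(pid), "oom_score_adj")
    return int(path.read_text().strip())


def is_exempt_from_oom_killer(score_adj):
    """True only for -1000, the one value that disables OOM killing.

    Any other value (including -900) only biases the kernel's choice.
    """
    return score_adj == -1000


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 check_oom.py <postmaster PID>
    pid = int(sys.argv[1])
    adj = read_oom_score_adj(pid)
    print(f"pid {pid}: oom_score_adj={adj}, "
          f"exempt={is_exempt_from_oom_killer(adj)}")
```

Running it against the postmaster PID and then against one of the backend PIDs should show whether the -900 from the unit file is in effect for the parent and whether the backends have reset theirs, as the comment in the unit file suggests.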