On Feb 1, 2017, at 1:45 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > > Israel Brewster <isr...@ravnalaska.net> writes: >> So just a bit ago I ran into a bit of excitement when the kernel decided >> to kill one of my postmaster processes due to an out-of-memory issue, > > Fun :-( > >> So a single postmaster process was using over 72GB of ram. > > No, the kernel was blaming it for 72GB, which is an entirely different > statement. The Linux OOM killer makes some assumptions that are > ludicrously wrong for Postgres: not only does it blame a parent process > for the total memory consumption of all its children, but if the children > share a large shared memory segment, *it counts the shared memory segment > over again for each child*. At least this was true last I looked; > perhaps very recent kernels are a bit less insane about shared memory. > In any case, the core problem is blaming the parent process for the > sins of a child. > > Now the PG postmaster itself consumes very little memory, and this is > quite unlikely to suddenly go wrong because it doesn't do very much. > A child backend process might go crazy, but what you want to happen then > is for the OOM killer to kill the child process not the postmaster. > That will still result in a database crash/restart scenario, but as long > as the postmaster is alive everything should recover automatically. > > Your problem, then, is that the OOM killer is egregiously and with malice > aforethought killing the wrong process. > > The usual fix for this is to configure things so that the postmaster is > excluded from OOM kill but its children aren't. See > https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT > (but be sure to consult the page for your PG version, as we've changed > the support mechanism for that in the past.) > > If you're using a vendor-supplied packaging of PG and it doesn't have some > easy way to turn on this behavior, complain to the vendor ... > > regards, tom lane >
Thanks for the explanation. This is a CentOS 6 box, kernel 2.6.32-642.11.1.el6.x86_64, running the PostgreSQL supplied Postgres 9.6.1, so hopefully the information on that page applies. I'll mess around with modifying the init.d script to exclude the postmaster process. Thanks again! ----------------------------------------------- Israel Brewster Systems Analyst II Ravn Alaska 5245 Airport Industrial Rd Fairbanks, AK 99709 (907) 450-7293 ----------------------------------------------- > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general