On Mon, 2006-03-06 at 13:26 -0500, Matthew T. O'Connor wrote:
> Geo Carncross wrote:
> > Daemons also run as root. Root-owned processes shouldn't be killed so
> > quickly by the OOM killer, so getting them killed by other means (say,
> > reasonable resource limits) is even more important.
> 
> The OOM Killer should be killed itself.  I keep up with the PostgreSQL 
> lists and they *HIGHLY* recommend disabling the OOM killer.  See section 
> 16.4.3 for details:
> 
> http://www.postgresql.org/docs/current/static/kernel-resources.html#AEN18105
> 
> Basically their view of the OOM killer is that it's a very bad idea for 
> a server that you want to be reliable. Basically if you tell the Kernel 
> not to overcommit memory, the OOM killer becomes moot, but you better 
> have enough mem / swap space to handle your needs.

The OOMK isn't _causing_ the problem. The problem is already there! You're
simply out of memory! It doesn't matter if malloc() fails or not.

The OOMK is simply one way of fixing it.

``to avoid this problem is to run PostgreSQL on a machine where you can
be sure that other processes will not run the machine out of memory.''

This is the nugget of truth. If you disable overcommit, then malloc() or
mmap() can fail. If you use overcommit, then the process gets killed
someplace else.

Pg has to be resistant to failure anyway- it has to protect against the
power cord getting yanked.

OOMK or not- you run out of memory, and Pg stops doing its job.

Daemons should be resistant to accidental death. Whether it be OOMK,
signal, or bug in the software.

Daemons cannot be resistant on their own: the only way to "make" a
program stay running is to run it from init, and that's only because
init (pid 1) cannot be killed.

If you run postmaster from init (as I do) then if Pg dies, it gets
restarted. This can be because Pg has a bug in it, or a signal gets sent
to the wrong process group, the OOMK goes nuts, or any number of other
reasons.
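
To make that concrete, here is a minimal sketch (in Python, and not what
init does internally) of a parent that keeps a foreground child running.
The name supervise and the max_restarts and delay parameters are made up
for illustration; a real init never gives up, and the cap exists only so
the sketch terminates.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=3, delay=0.1):
    """Run argv in the foreground and restart it each time it exits.

    Returns the exit statuses seen (negative values mean death by signal).
    max_restarts is an assumption of this sketch so that it terminates.
    """
    statuses = []
    for _ in range(max_restarts):
        child = subprocess.Popen(argv)  # the program stays our direct child
        statuses.append(child.wait())   # returns when it dies, for any reason
        time.sleep(delay)               # short pause so a crash loop does not spin
    return statuses


if __name__ == "__main__":
    # Hypothetical stand-in for a daemon that dies: it exits with status 1.
    crashing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(crashing))
```

The parent doesn't care why the child died (bug, signal, OOMK); wait()
returns either way and the loop starts it again.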

YES: there will be a short time in which queries aren't answered, but
that's better than a LONG time where queries aren't answered, and the
OOMK has absolutely nothing to do with that.

It depends only on whether you use init.d or inittab.

My point here is that when a process (like dbmail) "daemonizes" itself,
it's no longer possible to "keep it running", and instead, we have to
resort to watchdog-style scripts.
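
A rough illustration of why, assuming a deliberately simplified
daemonizing program (one fork, no setsid(), so not a full daemonization):
the process the supervisor started exits at once with status 0, while the
real work carries on in a process the supervisor never sees. The names
SELF_DAEMONIZING, run_and_wait and is_alive are hypothetical, and the
program prints the surviving pid only so that the sketch can find it.

```python
import os
import signal
import subprocess
import sys

# Hypothetical stand-in for a self-daemonizing program: it forks, the
# original process reports the child's pid and exits immediately, and
# the forked child carries on (here it just sleeps).
SELF_DAEMONIZING = """
import os, time
pid = os.fork()
if pid > 0:
    os.write(1, f"{pid}\\n".encode())
    os._exit(0)
time.sleep(30)
"""


def run_and_wait(source):
    """Start the program and wait() on it as a supervisor would.

    Returns (exit status seen by the supervisor, pid left running).
    """
    child = subprocess.Popen(
        [sys.executable, "-c", source],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    status = child.wait()                    # returns as soon as the parent exits
    survivor = int(child.stdout.readline())  # pid of the process still running
    child.stdout.close()
    return status, survivor


def is_alive(pid):
    """Return True if a process with this pid exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


if __name__ == "__main__":
    status, survivor = run_and_wait(SELF_DAEMONIZING)
    print("supervisor saw exit status", status)
    print("daemon still running:", is_alive(survivor))
    os.kill(survivor, signal.SIGTERM)        # clean up the stand-in daemon
```

From the supervisor's side the program has "finished", so there is
nothing left to wait() on and nothing to restart when the real daemon
later dies.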

I personally think watchdog-style scripts are stupid. You don't have
to agree; I'm not trying to convince YOU that they're bad or anything.
As I mentioned elsewhere on the thread, init.d can daemonize a program
WITHOUT the assistance of the program, but if the program DOES daemonize
on its own, then it can no longer be kept running.

So you say: "turn off the OOMK so Pg stays running"
I say: "Turn off init.d so Pg stays running"

-- 
Internet Connection High Quality Web Hosting
http://www.internetconnection.net/
