Jeff, this is very good advice.

I have had many, many hours of deep joy getting to know the OOM killer
and all of his wily ways.
Respect the OOM Killer!

On the cluster I manage, the OOM killer is enabled; however, there is a
strict policy that if the OOM killer kicks in on a cluster node, that
node is excluded from the batch system and rebooted.
As you say, you can't tell which processes it will go off and kill.

However, there is a very useful sysctl setting for OOM:

vm.oom_kill_allocating_task     Set this to 1 and the system kills the
task which triggered the OOM, rather than doing a scan of system
processes.
I find that in an HPC environment this will kill the executable which
is using too much memory.

http://www.linuxinsight.com/proc_sys_vm_oom_kill_allocating_task.html
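
For anyone who wants to check how a node is currently set, here is a
minimal sketch in Python. It assumes the usual procfs path for this
sysctl on Linux; the function names are my own, purely for
illustration, and not part of any existing tool.

```python
from pathlib import Path

# Usual procfs location of the sysctl on Linux (assumed mounted at /proc).
OOM_SYSCTL = Path("/proc/sys/vm/oom_kill_allocating_task")


def oom_kills_allocating_task(path=OOM_SYSCTL):
    """Return True if the sysctl is nonzero, False if it is 0,
    and None if the file is missing or unreadable (e.g. not Linux)."""
    try:
        return int(Path(path).read_text().strip()) != 0
    except (OSError, ValueError):
        return None


def describe(state):
    """Turn the tri-state result into a one-line summary."""
    if state is None:
        return "could not read vm.oom_kill_allocating_task"
    if state:
        return "OOM killer kills the task that triggered the OOM"
    return "OOM killer scans processes and picks a victim"


if __name__ == "__main__":
    print(describe(oom_kills_allocating_task()))
```

To change the setting on a running node, "sysctl -w
vm.oom_kill_allocating_task=1" as root does it; adding the line
"vm.oom_kill_allocating_task = 1" to /etc/sysctl.conf makes it persist
across reboots.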
