On Fri, 11 Sep 2009, Tracy Reed wrote:

> On Fri, Sep 11, 2009 at 05:55:59PM -0400, [email protected] spake thusly:
>> Ha! What happens if something else gets killed instead (sshd? iptables?
>> syslogd?)...then things get really ugly...not only is apache running badly,
>
> Important stuff is monitored and restarted by configuration
> management. So far I haven't had it be a problem.

Hi Tracy.  Good luck with that.  The OOM killer could easily hit part of 
your management system (even if it only uses sshd to allow a remote system 
to log in and restart the app).

Figuring out how the OOM killer should decide what to whack is a much 
bigger problem than most people realise.  Have a look at the discussions 
on LKML, which span years.  The bottom line is that for a general-purpose 
server there are no really good solutions to this problem.

The best way to manage an OOM condition is to avoid it in the first 
place[1].

If the OOM killer activates, all bets are off about the stability of the 
system.  Security might even be impacted.  Even if the OOM killer 
algorithm tries to spare important processes (Linux's does), there is no 
way of knowing how bad the OOM is or how long it will last.  No process 
is immune.
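
To get a feel for who would be first in line, here is a rough Python 
sketch that ranks processes by the badness score the kernel exposes in 
/proc/<pid>/oom_score (higher means more likely to be killed).  The 
function name and its parameters are my own invention, and it assumes a 
kernel new enough to provide /proc/<pid>/comm; processes whose files 
can't be read are simply skipped.

```python
import os


def rank_oom_candidates(proc_root="/proc", limit=10):
    """Return up to `limit` (score, pid, name) tuples, highest oom_score first.

    `proc_root` is a hypothetical parameter so the function can be pointed
    at a fake tree for testing; on a real box leave it at /proc.
    """
    candidates = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "oom_score")) as f:
                score = int(f.read().strip())
            with open(os.path.join(base, "comm")) as f:
                name = f.read().strip()
        except (OSError, ValueError):
            continue  # process exited, or the file is missing/unreadable
        candidates.append((score, int(entry), name))
    # Highest score first; ties are broken by lowest PID
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return candidates[:limit]
```

Printing the result of rank_oom_candidates() on a loaded server is a 
quick way to check whether sshd or your management agent sits 
uncomfortably close to the top.  It is only a snapshot; the scores move 
as memory usage changes.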

Having the box reboot on OOM isn't necessarily a bad idea in some 
situations, but what if the OOM condition hits all the servers in the 
farm at the same time[2]?
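
For anyone wanting to check whether a box is set up that way, here is a 
small Python sketch.  It assumes the reboot is done the usual way, via 
the vm.panic_on_oom and kernel.panic sysctls (I don't know how Tracy's 
boxes are actually configured), and the function name and sys_root 
parameter are made up for illustration.

```python
import os


def reboot_on_oom_status(sys_root="/proc/sys"):
    """Report whether an OOM would panic the kernel and then reboot the box.

    `sys_root` is a hypothetical parameter so a fake tree can be used in
    tests; on a real system leave it at /proc/sys.
    """
    def read_int(rel_path):
        with open(os.path.join(sys_root, rel_path)) as f:
            return int(f.read().strip())

    # vm.panic_on_oom: 0 = run the OOM killer, non-zero = panic instead
    panic_on_oom = read_int("vm/panic_on_oom")
    # kernel.panic: 0 = hang after a panic, non-zero = reboot
    panic_timeout = read_int("kernel/panic")
    return {
        "panic_on_oom": panic_on_oom,
        "panic_timeout": panic_timeout,
        "reboots_on_oom": panic_on_oom != 0 and panic_timeout != 0,
    }
```

Running this across the farm at least tells you how many machines would 
go down together if they all hit OOM at once.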

[1] At this point I'm thinking about a quote from the movie Wargames ;)

[2] Not as unrealistic as it might seem at first.  The OOM might be a 
by-product of external stimuli like an attempted DoS against the servers. 
The servers might help the attackers by DoSing themselves.

Cheers,

Rob

-- 
I tried to change the world but they had a no-return policy
http://www.practicalsysadmin.com
_______________________________________________
Discuss mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
