Hi Austin,

On 05/13/2016 03:51 PM, Austin S. Hemmelgarn wrote:
> On 2016-05-13 09:32, Sebastian Frias wrote:
>> I didn't see that in Documentation/vm/overcommit-accounting or am I looking 
>> in the wrong place?
> It's controlled by a sysctl value, so it's listed in 
> Documentation/sysctl/vm.txt
> The relevant sysctl is vm.oom_kill_allocating_task

Thanks, I just read that.
It does not look like a replacement for overcommit=never, though.
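
For anyone following along, here is a minimal sketch (not anything from the 
kernel tree) that reads the two settings side by side. It assumes the usual 
/proc/sys/vm layout and the documented meaning of vm.overcommit_memory 
(0 = heuristic, 1 = always, 2 = never); the helper names and the "root" 
parameter are made up for illustration.

```python
from pathlib import Path

# Documented values of vm.overcommit_memory.
OVERCOMMIT_MODES = {0: "heuristic", 1: "always", 2: "never"}


def read_vm_sysctl(name, root="/proc/sys/vm"):
    """Return the integer value of a vm.* sysctl, or None if it cannot be read."""
    try:
        return int((Path(root) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def describe(root="/proc/sys/vm"):
    """Summarize the overcommit policy and the OOM-killer victim selection knob."""
    overcommit = read_vm_sysctl("overcommit_memory", root)
    kill_alloc = read_vm_sysctl("oom_kill_allocating_task", root)
    return {
        # Name of the overcommit policy, or "unknown" if the value is unreadable.
        "overcommit_memory": OVERCOMMIT_MODES.get(overcommit, "unknown"),
        # True: kill the task that triggered the OOM; False: pick one heuristically.
        "oom_kill_allocating_task": None if kill_alloc is None else bool(kill_alloc),
    }


if __name__ == "__main__":
    print(describe())
```

The two knobs are independent: the first decides whether allocations may be 
overcommitted at all, the second only changes which task is killed once the 
system is already out of memory.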

>>
>>>>
>>>> Well, it's hard to report, since it is essentially the result of a dynamic 
>>>> system.
>>>> I could assume it killed terminals with a long history buffer, or editors 
>>>> with many buffers (or big buffers).
>>>> Actually, when it happened, I just turned overcommit off. I just checked 
>>>> and it is on again on my desktop; I probably forgot to make it a 
>>>> permanent setting.
>>>>
>>>> In the end, no process is a good candidate for termination.
>>>> What works for you may not work for me, that's the whole point, there's a 
>>>> heuristic (which conceptually can never be perfect), yet the mere fact 
>>>> that some process has to be killed is somewhat chilling.
>>>> I mean, all running processes are supposedly there and running for a 
>>>> reason.
>>> OTOH, just because something is there for a reason doesn't mean it's doing 
>>> what it's supposed to be.  Bugs happen, including memory leaks, and if 
>>> something is misbehaving enough that it impacts the rest of the system, it 
>>> really should be dealt with.
>>
>> Exactly, it's just that in this case, the system is deciding how to deal 
>> with the situation by itself.
> On a busy server where uptime is critical, you can't wait for someone to 
> notice and handle it manually, you need the issue resolved ASAP.  Now, this 
> won't always kill the correct thing, but if it's due to a memory leak, it 
> often will work like it should.

The key word is "often".
So you are saying that it will kill the program that is leaking memory in 
what, something like 90% of the cases?
I'm not sure I would set up a server with critical uptime to have the 
OOM-killer enabled. Do you think that'd be a good idea?

Anyway, as a side note, I just want to say thank you guys for having this 
discussion.
I think it is an interesting thread and hopefully it will advance the 
"knowledge" about this setting.

>>
>>>
>>> This brings to mind a complex bug involving Tor and GCC whereby building 
>>> certain (old) versions of Tor with certain (old) versions of GCC with -Os 
>>> would cause an infinite loop in GCC.  You obviously have GCC running for a 
>>> reason, but that doesn't mean that it's doing what it should be.
>>
>> I'm not sure if I followed the analogy/example, but are you saying that the 
>> OOM-killer killed GCC in your example?
>> This seems an odd example though, I mean, shouldn't the guy in front of the 
>> computer notice the loop and kill GCC by himself?
> No, I didn't mean as an example of the OOM killer, I just meant as an example 
> of software not doing what it should.  It's not as easy to find an example 
> for the OOM killer, so I don't really have a good example. The general 
> concept is the same though, the only difference is there isn't a kernel 
> protection against infinite loops (because they aren't always bugs, while 
> memory leaks and similar are).

So how does the kernel know that a process is "leaking memory" as opposed to 
just "using lots of memory"? (Wouldn't that be comparable to asking how the 
kernel knows the difference between an infinite loop and one that is not?)

Best regards,

Sebastian
