Charles-François Natali added the comment:
> @Charles-François: I think your worries about calloc and overcommit are
> unjustified. First, calloc and malloc+memset actually behave the same way
> here -- with a large allocation and overcommit enabled, malloc and calloc
> will both go ahead and return the large allocation, and then the actual
> out-of-memory (OOM) event won't occur until the memory is accessed. In the
> malloc+memset case this access will occur immediately after the malloc,
> during the memset -- but this is still too late for us to detect the malloc
> failure.
Not really: what you describe only holds for a single object.
But if you allocate, say, 1000 such objects at once:
- in the malloc + memset case, the committed pages are touched
progressively (i.e. the pages backing object N are dirtied before the
memory for object N+1 is allocated), so they are counted not only as
committed but also as active (for example, the RSS increases
gradually): at some point, even though the Linux VM subsystem is by
default quite lenient about overcommitting, malloc/mmap will likely
return NULL because of this
- in the calloc() case, all the memory is committed up front but never
touched: the kernel will likely happily overcommit all of it, and the
OOM only kicks in once you start progressively accessing the pages
(see the sketch below)
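To make the difference concrete, here is a minimal sketch (mine, not
from the patch under discussion; Linux-only, with arbitrary demo sizes)
that drives the libc allocator through ctypes and reads the accounting
back from /proc/self/status:

import ctypes
import ctypes.util

# Hypothetical demo: contrast how calloc vs malloc+memset show up in
# the kernel's memory accounting. Linux-only; sizes are demo values.
libc = ctypes.CDLL(ctypes.util.find_library('c'))
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.calloc.restype = ctypes.c_void_p
libc.calloc.argtypes = [ctypes.c_size_t, ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

SIZE = 10 * 1024 * 1024   # 10 MB per "object"
COUNT = 100               # ~1 GB committed in total

def vm(field):
    # Return the value of e.g. 'VmRSS:' from /proc/self/status, in kB.
    with open('/proc/self/status') as f:
        for line in f:
            if line.startswith(field):
                return line.split()[1] + ' kB'

# calloc: the pages are committed but never touched, so the kernel
# happily overcommits and the RSS barely moves.
blocks = [libc.calloc(1, SIZE) for _ in range(COUNT)]
print('calloc:        RSS =', vm('VmRSS'))
for p in blocks:
    libc.free(p)

# malloc + memset: object N's pages are dirtied before object N+1 is
# allocated, so the RSS grows in step with the committed memory.
blocks = [libc.malloc(SIZE) for _ in range(COUNT)]
for p in blocks:
    ctypes.memset(p, 0, SIZE)
print('malloc+memset: RSS =', vm('VmRSS'))
for p in blocks:
    libc.free(p)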
> Second, OOM does not cause segfaults on any system I know. On Linux it wakes
> up the OOM killer, which shoots some random (possibly guilty) process in the
> head. The actual program which triggered the OOM is quite likely to escape
> unscathed.
Ah, did I say segfault?
Sorry, I of course meant that the process will get nuked by the OOM killer.
> In practice, the *only* cases where you can get a MemoryError on modern
> systems are (a) if the user has turned overcommit off, (b) you're on a tiny
> embedded system that doesn't have overcommit, (c) if you run out of virtual
> address space. None of these cases are affected by the differences between
> malloc and calloc.
That's a common misconception: provided that the allocated memory is
accessed progressively (see the point above), you'll often get ENOMEM
even with overcommitting enabled. Note that vm.overcommit_memory = 0
below is the default heuristic mode, i.e. overcommit is on:
$ /sbin/sysctl -a | grep overcommit
vm.nr_overcommit_hugepages = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
$ cat /tmp/test.py
l = []
with open('/proc/self/status') as f:
    try:
        for i in range(50000000):
            l.append(i)
    except MemoryError:
        for line in f:
            if 'VmPeak' in line:
                print(line)
        raise
$ python /tmp/test.py
VmPeak: 720460 kB
Traceback (most recent call last):
File "/tmp/test.py", line 7, in <module>
l.append(i)
MemoryError
I have a 32-bit machine, but the process definitely has more than
720 MB of address space ;-)
If your statement were true, it would be almost impossible to get
ENOMEM with overcommitting on a 64-bit machine, which is - fortunately
- not the case. Just try python -c "[i for i in range(<large value>)]"
on a 64-bit machine: I'll bet you'll get a MemoryError (ENOMEM).
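For the record, a self-contained variant of that one-liner (the range
size below is a placeholder I picked; on a real 64-bit box it should
comfortably exceed RAM + swap) that also reports the peak RSS at the
moment of failure:

import resource

# Hypothetical variant of the test above; N is an arbitrary
# placeholder, pick something larger than physical RAM + swap.
N = 10**9

l = []
try:
    for i in range(N):
        l.append(i)
except MemoryError:
    # ru_maxrss is reported in kB on Linux.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print('MemoryError with peak RSS at %d kB' % peak)
    raise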
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue21233>
_______________________________________