On 2013-07-20T15:23:41 EEST, Chris Samuel wrote:
>
> Hi there,
>
> On Sat, 20 Jul 2013 02:53:52 AM Bjørn-Helge Mevik wrote:
>
>> With the recent changes in glibc in how virtual memory is allocated for
>> threaded applications, limiting virtual memory usage for threaded
>> applications is IMO not a good idea. (One example: our slurmctld has
>> allocated 16.1 GiB virtual memory, but is only using 104 MiB resident.)
>
> Would you have a pointer to these changes please?
From a recent message by yours truly to a slurm-dev thread about slurmctld
memory consumption:

"""
Yes, this is what we're seeing as well. 6.5 GiB VMEM, 376 MB RSS.

The change is that as of glibc 2.10 a more scalable malloc()
implementation is used. The new implementation creates up to 8 (2 on
32-bit) pools per core, each 64 MB in size. Thus in our case, where
slurmctld runs on a machine with 12 cores, we have up to 12*8*64 = 6144 MB
in those malloc pools. See http://udrepper.livejournal.com/20948.html
"""

I would go even further than Bjørn-Helge and claim that limiting virtual
memory is, in general, the wrong thing to do. Address space is essentially
free and doesn't impact other applications, so IMHO the workload manager
has no business limiting it. The glibc malloc() behavior is just one
situation where trying to limit virtual memory goes wrong; there are
other situations where allocating lots of virtual memory is common. For
example, garbage-collected runtimes such as Java often reserve huge heaps
to use as the garbage collection arena, but only a small fraction of that
is actually used.

>> I would suggest looking at cgroups for limiting memory usage.
>
> Unfortunately cgroups doesn't limit usage (i.e. cause malloc() to fail
> should it have reached its limit); if I understand it correctly it just
> invokes the OOM killer on a candidate process within the cgroup once the
> limit is reached. :-(

Yes, that's my understanding as well. On the "positive" side, few
applications can sensibly handle malloc() failures anyway. Often the best
that can be done without heroic effort is to print an error message to
stderr and abort(), which is not terribly different from being killed by
the OOM killer anyway.
There are a few efforts in the Linux kernel community to do something
about this, roughly going in a couple of slightly different directions:

- Provide some notification to applications that "you're exceeding your
  memory limit, release some memory quickly or face the wrath of the OOM
  killer". See

  https://lwn.net/Articles/552789/
  https://lwn.net/Articles/548180/

- Provide a mechanism for applications to mark memory ranges as
  "volatile", so the kernel can drop them if memory gets tight instead of
  going on an OOM killer spree.

  https://lwn.net/Articles/522135/
  https://lwn.net/Articles/554098/

That being said, AFAIK none of the above exists in the upstream kernel
today. So for now IMHO the least bad approach is what slurm already does:
limit RSS (either with cgroups or by polling) and kill jobs if the limit
is exceeded.

-- 
Janne Blomqvist, D.Sc. (Tech.), Scientific Computing Specialist
Aalto University School of Science, PHYS & BECS
+358503841576 || janne.blomqv...@aalto.fi