On 4/12/19 10:06 PM, Roman Gushchin wrote:
> On Fri, Apr 12, 2019 at 03:14:18PM -0400, Johannes Weiner wrote:
>> With the default overcommit==guess we occasionally run into mmap
>> rejections despite plenty of memory that would get dropped under
>> pressure but just isn't accounted reclaimable. One example of this is
>> dying cgroups pinned by some page cache. A previous case was auxiliary
>> path name memory associated with dentries; we have since annotated
>> those allocations to avoid overcommit failures (see d79f7aa496fc ("mm:
>> treat indirectly reclaimable memory as free in overcommit logic")).
>>
>> But trying to classify all allocated memory reliably as reclaimable
>> and unreclaimable is a bit of a fool's errand. There could be a myriad
>> of dependencies that constantly change with kernel versions.

Just wondering, did you find at least one other reclaimable case like
those path names?

>> It becomes even more questionable of an effort when considering how
>> this estimate of available memory is used: it's not compared to the
>> system-wide allocated virtual memory in any way. It's not even
>> compared to the allocating process's address space. It's compared to
>> the single allocation request at hand!
>>
>> So we have an elaborate left-hand side of the equation that tries to
>> assess the exact breathing room the system has available down to a
>> page - and then compare it to an isolated allocation request with no
>> additional context. We could fail an allocation of N bytes, but for
>> two allocations of N/2 bytes we'd do this elaborate dance twice in a
>> row and then still let N bytes of virtual memory through. This doesn't
>> make a whole lot of sense.
>>
>> Let's take a step back and look at the actual goal of the
>> heuristic. From the documentation:
>>
>>    Heuristic overcommit handling. Obvious overcommits of address
>>    space are refused. Used for a typical system. It ensures a
>>    seriously wild allocation fails while allowing overcommit to
>>    reduce swap usage. root is allowed to allocate slightly more
>>    memory in this mode. This is the default.
>>
>> If all we want to do is catch clearly bogus allocation requests
>> irrespective of the general virtual memory situation, the physical
>> memory counter-part doesn't need to be that complicated, either.
>>
>> When in GUESS mode, catch wild allocations by comparing their request
>> size to total amount of ram and swap in the system.
>>
>> Signed-off-by: Johannes Weiner <[email protected]>
>
> My 2c here: any kinds of percpu counters and percpu data is accounted
> as unreclaimable and can alter the calculation significantly.
>
> This is a special problem on hosts, which were idle for some time.
> Without any memory pressure, kernel caches do occupy most of the memory,
> so than a following attempt to start a workload fails.

So then we remove the kmalloc-reclaimable caches again as not worth the
trouble anymore (they might be useful for anti-fragmentation purposes,
but that's much harder to quantify), or what?

> With a big pleasure:
> Acked-by: Roman Gushchin <[email protected]>
>
> Thanks!
>
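
FWIW, to make the "N bytes vs. two times N/2 bytes" point from the
changelog concrete, here is a toy model in Python. It is not kernel
code: the function names, the machine size and the "estimated free"
figure are all made up for illustration, and old_style_check() is a
heavy simplification of the existing estimate. guess_check() only
mirrors what the changelog says (compare the request size to total RAM
plus swap), not necessarily how the patch implements it.

```python
def old_style_check(request_pages, estimated_free_pages):
    """Simplified stand-in: one request vs. an estimate of available pages."""
    return request_pages <= estimated_free_pages


def guess_check(request_pages, total_ram_pages, total_swap_pages):
    """One request vs. total RAM plus swap, as the changelog describes."""
    return request_pages <= total_ram_pages + total_swap_pages


# Hypothetical machine: 4 GiB RAM, 1 GiB swap, 4 KiB pages. Most RAM is
# held by caches that are droppable but not accounted as reclaimable,
# so the estimate of available memory is small.
PAGE = 4096
ram = 4 * 2**30 // PAGE
swap = 1 * 2**30 // PAGE
estimated_free = ram // 8

n = ram // 4  # a single request of N pages (1 GiB)

# Estimate-based check: N is refused, yet two requests of N/2 both pass,
# so N pages of virtual memory get through anyway.
one_big_old = old_style_check(n, estimated_free)
two_halves_old = [old_style_check(n // 2, estimated_free) for _ in range(2)]

# Size-based check: the same N passes, only a request larger than
# RAM plus swap is refused.
one_big_new = guess_check(n, ram, swap)
wild_new = guess_check(ram + swap + 1, ram, swap)

print(one_big_old, two_halves_old, one_big_new, wild_new)
```

Neither check carries any state between calls, which is why splitting a
request changes the outcome of the first one and why only the size of
the single request matters in the second.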

