glibc is somewhat notorious for retaining released C-heap memory: calling 
free(3) returns memory to glibc, and most libc variants will return at 
least a portion of it to the operating system, but glibc often does 
not.

This depends on the granularity of the allocations and a number of other 
factors, but we found that many small allocations in particular may cause the 
process heap segment (and hence RSS) to become bloated. This can prevent the 
VM from recovering from C-heap usage spikes.

glibc offers an API, "malloc_trim", which can be used to make glibc return 
freed memory to the operating system.
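
For illustration only, a minimal sketch of calling that API directly from C. 
The helper name trim_libc_heap and the -1 return value for non-glibc builds 
are assumptions made for this sketch; this is not the code from the patch.

```c
#include <assert.h>
#include <stdlib.h>   /* on glibc this also pulls in the __GLIBC__ macro */

#ifdef __GLIBC__
#include <malloc.h>   /* malloc_trim(3), a glibc extension */
#endif

/* Hypothetical helper, not taken from the patch.
 * Returns 1 if glibc released memory to the OS, 0 if it could not
 * release anything, and -1 when not built against glibc. */
static int trim_libc_heap(void) {
#ifdef __GLIBC__
    /* pad = 0: leave no extra free space at the top of the heap */
    int released = malloc_trim(0);
    assert(released == 0 || released == 1);
    return released;
#else
    return -1;
#endif
}
```

A return value of 0 is not an error; it only means there was nothing glibc 
could hand back at that moment.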

This may cost performance, however, and therefore I hesitate to call 
malloc_trim automatically. That may be an idea for another day.

Instead of an automatic trim I propose to add a jcmd which allows the user to 
manually trigger a libc heap trim. Such a command would have two purposes:
- when analyzing cases of high memory footprint, it allows one to distinguish 
"real" footprint, e.g. leaks, from cases where glibc just holds on to 
memory
- as a stopgap measure, it allows relieving pressure in a high-footprint 
scenario.

Note that this command also helps with analyzing libc peaks which had nothing 
to do with the VM - e.g. peaks created by customer code which just happens to 
share the same process as the VM. Such memory does not even have to show up in 
NMT.

I propose to introduce this command for Linux only. Other OSes (apart from 
maybe AIX) do not seem to have this problem, but Linux is arguably important 
enough in itself to justify a Linux-specific jcmd.

If this finds agreement, I will file a CSR.

=========

This patch:

- introduces a new jcmd, "VM.trim_libc_heap", no arguments, which trims the 
glibc heap on glibc platforms.
- includes a (rather basic) test
- the command calls malloc_trim(3), and additionally prints out its effect 
(changes caused in virt size, rss and swap space)
- I refactored some code in os_linux.cpp to factor out scanning 
/proc/self/status to get kernel memory information.
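
To make the last two points more concrete, here is a rough sketch of how the 
VmSize, VmRSS and VmSwap values can be read from a status-format file and how 
a before/after line can be formatted. The struct and function names are 
hypothetical and the code is not what os_linux.cpp contains; the output 
format simply mimics the lines shown in the example below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container; field names are illustrative only. */
struct mem_info {
    long vmsize_kb;  /* "VmSize:" line */
    long vmrss_kb;   /* "VmRSS:" line  */
    long vmswap_kb;  /* "VmSwap:" line */
};

/* Scan a file in /proc/<pid>/status format (normally "/proc/self/status").
 * Returns 0 if all three values were found, -1 otherwise. */
static int read_mem_info(const char *path, struct mem_info *out) {
    assert(path != NULL && out != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }
    out->vmsize_kb = out->vmrss_kb = out->vmswap_kb = -1;
    char line[256];
    while (fgets(line, sizeof line, f) != NULL) {
        /* Lines look like "VmRSS:\t  920740 kB"; a non-matching
         * sscanf leaves the target field untouched. */
        if (sscanf(line, "VmSize: %ld kB", &out->vmsize_kb) == 1) continue;
        if (sscanf(line, "VmRSS: %ld kB", &out->vmrss_kb) == 1) continue;
        if (sscanf(line, "VmSwap: %ld kB", &out->vmswap_kb) == 1) continue;
    }
    fclose(f);
    return (out->vmsize_kb >= 0 && out->vmrss_kb >= 0 &&
            out->vmswap_kb >= 0) ? 0 : -1;
}

/* Format one "before/after" line; returns what snprintf returns. */
static int format_delta(char *buf, size_t len, const char *label,
                        long before_kb, long after_kb) {
    return snprintf(buf, len, "%s before: %ldk, after: %ldk, (%ldk)",
                    label, before_kb, after_kb, after_kb - before_kb);
}
```

A caller would take one struct mem_info snapshot before the trim and one 
after it, then pass each pair of values to format_delta.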

=========

Example:

A program causes a temporary peak in C-heap usage (in this case, triggered via 
Unsafe.allocateMemory) and frees the memory again right away, so it is not 
leaky. The peak in RSS was ~8G (even though the user allocation was way 
smaller - glibc has a lot of overhead). The effects of this peak linger even 
after returning that memory to glibc:



thomas@starfish:~$ jjjcmd AllocCHeap VM.info | grep Resident
Resident Set Size: 8685896K (peak: 8685896K) (anon: 8648680K, file: 37216K, 
shmem: 0K)
                   ^^^^^^^^


We execute the new trim command via jcmd:


thomas@starfish:~$ jjjcmd AllocCHeap VM.trim_libc_heap
18770:
Attempting trim...
Done.
Virtual size before: 28849744k, after: 28849724k, (-20k)
RSS before: 8685896k, after: 920740k, (-7765156k)  <<<<
Swap before: 0k, after: 0k, (0k)


It prints the change in virtual size, RSS and swap. The virtual size barely 
changed (-20k) since glibc unmapped practically no mappings. However, glibc 
shrank the process heap heavily, resulting in a large drop in RSS 
(~8.3G -> ~900M) and freeing more than 7G of memory:


thomas@starfish:~$ jjjcmd AllocCHeap VM.info | grep Resident
Resident Set Size: 920740K (peak: 8686004K) (anon: 883460K, file: 37280K, 
shmem: 0K)
                   ^^^^^^^


When the VM is started with -Xlog:os, this is also logged:


[139,068s][info][os] malloc_trim:
[139,068s][info][os] Virtual size before: 28849744k, after: 28849724k, (-20k)
RSS before: 8685896k, after: 920740k, (-7765156k)
Swap before: 0k, after: 0k, (0k)
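
The retention effect from this example can also be approximated outside the 
VM. The following is a sketch under assumptions, not the AllocCHeap test 
program used above: it allocates many small blocks, frees all but the last 
one (so the top of the heap stays pinned), and samples RSS from 
/proc/self/statm before and after malloc_trim. All names are hypothetical, 
and the exact numbers depend on the glibc version and its tunables.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_trim(3), glibc-specific */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Resident set size in KiB, from the second field of /proc/self/statm
 * (which is counted in pages); -1 on error. */
static long rss_kb(void) {
    long size_pages, resident_pages;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL) {
        return -1;
    }
    int n = fscanf(f, "%ld %ld", &size_pages, &resident_pages);
    fclose(f);
    if (n != 2) {
        return -1;
    }
    return resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
}

struct spike_result {
    long rss_after_free_kb;  /* RSS once the blocks were freed   */
    long rss_after_trim_kb;  /* RSS after malloc_trim(0) was run */
};

/* Allocate `count` blocks of `block_size` bytes, touch them, free all
 * but the last one, then trim. Returns 0 on success, -1 on failure. */
static int run_spike(size_t count, size_t block_size,
                     struct spike_result *out) {
    assert(out != NULL && count > 0 && block_size > 0);
    void **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL) {
        return -1;
    }
    for (size_t i = 0; i < count; i++) {
        blocks[i] = malloc(block_size);
        if (blocks[i] == NULL) {
            while (i > 0) free(blocks[--i]);
            free(blocks);
            return -1;
        }
        memset(blocks[i], 1, block_size);  /* touch so pages become resident */
    }
    /* Keep the last block alive: it pins the top of the heap, so the
     * freed space sits below it and is not given back automatically. */
    for (size_t i = 0; i + 1 < count; i++) {
        free(blocks[i]);
    }
    out->rss_after_free_kb = rss_kb();
    malloc_trim(0);
    out->rss_after_trim_kb = rss_kb();
    free(blocks[count - 1]);
    free(blocks);
    return 0;
}
```

Calling run_spike(65536, 1024, &result) creates a peak of roughly 64M; on a 
typical glibc system I would expect rss_after_trim_kb to come out clearly 
lower than rss_after_free_kb, mirroring the drop shown in the jcmd output.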

-------------

Commit messages:
 - start

Changes: https://git.openjdk.java.net/jdk/pull/4510/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4510&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8268893
  Stats: 237 lines in 6 files changed: 212 ins; 7 del; 18 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4510.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4510/head:pull/4510

PR: https://git.openjdk.java.net/jdk/pull/4510
