[ https://issues.apache.org/jira/browse/CASSANDRA-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ariel Weisberg updated CASSANDRA-12699:
---------------------------------------
    Attachment: cassandraMemoryLog.sh

> Excessive use of "hidden" Linux page table memory
> -------------------------------------------------
>
>                 Key: CASSANDRA-12699
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12699
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.2.7 on Red Hat 6.7, with Java 1.8.0_73.
> Probably others.
>            Reporter: Heiko Sommer
>         Attachments: PageTableMemoryExample.png, cassandra-env.sh,
> cassandra.yaml, cassandraMemoryLog.sh, cassandraMemoryLog.sh
>
>
> The Cassandra JVM process uses many gigabytes of page table memory during
> certain activities, which can trigger the Linux oom-killer and produce
> "java.lang.OutOfMemoryError: null" log entries.
> Page table memory is not reported by Linux tools such as "top" or "ps" and
> might therefore also be responsible for other unexplained Cassandra issues
> with "memory eating" or crashes, e.g. CASSANDRA-8723.
> The problem happens especially (or only?) during large compactions and
> anticompactions.
> Eventually all memory gets released, which means there is no real leak. Still,
> I suspect that the memory mappings that fill the page table could be released
> much sooner, to keep the page table size at a small fraction of the total
> Cassandra process memory.
>
> How to reproduce: Record the memory use on a Cassandra node, including page
> table memory, for example using the attached script cassandraMemoryLog.sh.
> Even when there is no crash, the ramping up and sudden release of page table
> memory is visible.
> A stacked area plot of the memory on one of our crashed nodes is attached
> (PageTableMemoryExample.png). The page table memory used by Cassandra is
> shown in red ("VmPTE").
> (In the plot we also see that the sum of the measured memory portions
> sometimes exceeds the total memory. This is probably an issue of how RSS
> memory is measured, perhaps including some buffers/cache memory that also
> counts toward available memory. It does not invalidate the finding that page
> table memory is growing to enormous sizes.)
>
> Shortly before the crash, /proc/$PID/status reported
> VmPeak: 6989760944 kB
> VmSize: 5742400572 kB
> VmLck: 4735036 kB
> VmHWM: 8589972 kB
> VmRSS: 7022036 kB
> VmData: 10019732 kB
> VmStk: 92 kB
> VmExe: 4 kB
> VmLib: 17584 kB
> VmPTE: 3965856 kB
> VmSwap: 0 kB
>
> The files cassandra.yaml and cassandra-env.sh used on the node where the data
> was taken are attached.
> Please let me know if I should provide any other data or descriptions to help
> with this ticket.
>
> Known workarounds: Use more RAM, or limit the amount of Java heap memory. In
> the above crash, MAX_HEAP_SIZE was not set, so the default heap size for
> 12 GB RAM was used (-Xms2976M, -Xmx2976M).
> We have not yet tried whether variations of heap vs. offheap config choices
> make a difference.
> Perhaps there are other workarounds using -XX:+UseLargePages or related Linux
> settings to reduce the size of the process page table?
>
> I believe that we see these crashes more often than other projects because we
> have a test system with not much RAM but with a lot of data (compressed ~3 TB
> per node), while the CPUs are slow, so that anti-/compactions overlap a lot.
> Ideally, Cassandra (native) code should be changed to release memory in
> smaller chunks, so that page table size cannot cause an otherwise stable
> system to crash.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
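
The "How to reproduce" step in the issue (recording process memory including page table memory) can be approximated with a short script. This is only a minimal sketch, not the attached cassandraMemoryLog.sh, whose contents are not shown here. It assumes a Linux /proc filesystem; the function names parse_vm_fields, page_table_fraction and log_memory and the 60-second sampling interval are hypothetical choices made for illustration.

```python
import re
import sys
import time

# Matches lines such as "VmPTE:   3965856 kB" in /proc/<pid>/status.
_VM_LINE = re.compile(r"^(Vm\w+):\s+(\d+)\s+kB$", re.MULTILINE)


def parse_vm_fields(status_text):
    """Return a dict mapping each Vm* field name to its value in kB."""
    return {name: int(value) for name, value in _VM_LINE.findall(status_text)}


def page_table_fraction(fields):
    """Return VmPTE divided by VmRSS, or None if either value is unavailable."""
    pte, rss = fields.get("VmPTE"), fields.get("VmRSS")
    if pte is None or not rss:
        return None
    return pte / rss


def log_memory(pid, interval_seconds=60):
    """Print timestamp, VmRSS and VmPTE (kB) for `pid` until interrupted.

    The interval is an assumed default, not a value taken from the issue.
    """
    while True:
        with open(f"/proc/{pid}/status") as status_file:
            fields = parse_vm_fields(status_file.read())
        print(int(time.time()), fields.get("VmRSS"), fields.get("VmPTE"), flush=True)
        time.sleep(interval_seconds)


if __name__ == "__main__":
    # Usage (hypothetical file name): python vm_pte_log.py <cassandra-pid>
    log_memory(int(sys.argv[1]))
```

Applied to the values quoted in the issue (VmPTE 3965856 kB, VmRSS 7022036 kB), page_table_fraction gives roughly 0.56, i.e. the page tables alone were more than half the size of the resident set shortly before the crash.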