OK, thanks. But maybe some kernel guy can help, or point me to a good resource 
to get educated, because I don't really get it.

The following is from our other small log cluster with 2 nodes with 8GB RAM. 
Cassandra has a 4GB max heap.

- We have disabled swap on all cassandra servers
- On the machine where I got the system OOM (not a Java OOM) I looked at dmesg 
and saw:

In the normal zone:

free:6604kB shows we have a problem (it is below min:6652kB)
unevictable:4176332kB because we have JNA

and page cache info (DMA32 zone):

active_anon:3118624kB 
active_file:0kB
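
To make the numbers above easier to compare, here is a minimal Python sketch that parses the per-zone counters and checks free memory against the min watermark. The two strings are copied from the dmesg snip at the end of this mail, shortened to the fields used; the helper names (parse_zone, summarize) are made up for illustration, and reading "free below min" as the trouble sign is my interpretation, not a statement about how the kernel decides to invoke the OOM killer.

```python
import re

# Counters copied from the dmesg snip in this mail (only the fields used here).
ZONES = {
    "DMA32": "free:23420kB min:4816kB low:6020kB high:7224kB "
             "active_anon:3118624kB active_file:0kB "
             "unevictable:268156kB present:3463804kB",
    "Normal": "free:6604kB min:6652kB low:8312kB high:9976kB "
              "active_anon:412072kB active_file:100kB "
              "unevictable:4176332kB present:4783356kB",
}


def parse_zone(line):
    """Return {counter_name: kB} for every 'name:123kB' pair on the line."""
    pairs = re.findall(r"([a-z_]+(?:\([a-z]+\))?):(\d+)kB", line)
    return {name: int(kb) for name, kb in pairs}


def summarize(counters):
    """Derive a few ratios that are easier to read than raw kB values."""
    present = counters["present"]
    return {
        "below_min": counters["free"] < counters["min"],
        "anon_share": counters["active_anon"] / present,
        "unevictable_share": counters["unevictable"] / present,
    }


if __name__ == "__main__":
    for zone_name, line in ZONES.items():
        s = summarize(parse_zone(line))
        print(f"{zone_name}: below_min={s['below_min']} "
              f"anon={s['anon_share']:.0%} "
              f"unevictable={s['unevictable_share']:.0%}")
```

Read this way, the Normal zone is almost entirely unevictable (mlocked) memory, while the DMA32 zone is almost entirely active anonymous memory, and neither zone has file-backed pages worth reclaiming.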

If I understand it correctly, active_anon and active_file correspond to 
non-file-backed (anonymous) and file-backed pages, respectively. 

Question 1:
So if the res memory that's non-heap is actually mmap'd files, shouldn't it 
show up in active_file?

- I compared /proc/meminfo on the node that was restarted with the other one 
that survived

Active(anon) on the restarted server is ~100MB; on the other it's > 1GB.
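
For anyone who wants to repeat the comparison, a rough Python sketch follows. It assumes you have saved the /proc/meminfo text from each node; parse_meminfo and diff_meminfo are hypothetical helper names, and the two sample strings are placeholders chosen only to match the rough magnitudes mentioned above, not real captures from our servers.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain
    counts. The unit suffix is simply dropped here.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[name.strip()] = int(fields[0])
    return values


def diff_meminfo(a, b, keys):
    """Return b[key] - a[key] for every requested key present in both."""
    return {k: b[k] - a[k] for k in keys if k in a and k in b}


# Placeholder samples (NOT real data), only roughly matching the mail.
RESTARTED = "Active(anon):     102400 kB\nActive(file):      51200 kB\n"
SURVIVOR = "Active(anon):    1153434 kB\nActive(file):      40960 kB\n"

if __name__ == "__main__":
    delta = diff_meminfo(parse_meminfo(RESTARTED), parse_meminfo(SURVIVOR),
                         ["Active(anon)", "Active(file)", "Mlocked"])
    for key, kb in delta.items():
        print(f"{key}: {kb:+d} kB")
```

In practice you would replace the placeholder strings with the contents of /proc/meminfo read on each node.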

Question 2:
Could anyone point me to a resource that explains how file system cache usage 
of the page cache and process usage of the page cache are orchestrated?
I understand that the swap daemon only checks the page cache when the number of 
free pages is getting low. So if res memory is used up by the java process 
(which is not controllable by -Xmx settings), it seems to compete with the file 
system cache. How is memory usage optimized so that the right parts of the 
files are in memory?

Thanks,
Daniel


snip from: dmesg

[7226178.039658] Node 0 DMA32 free:23420kB min:4816kB low:6020kB high:7224kB 
active_anon:3118624kB inactive_anon:20048kB active_file:0kB inactive_file:764kB 
unevictable:268156kB isolated(anon):0kB isolated(file):0kB present:3463804kB 
mlocked:268156kB dirty:0kB writeback:32kB mapped:1944kB shmem:120kB 
slab_reclaimable:1124kB slab_unreclaimable:1740kB kernel_stack:1368kB 
pagetables:9636kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:160 
all_unreclaimable? yes
[7226178.039672] Node 0 Normal free:6604kB min:6652kB low:8312kB high:9976kB 
active_anon:412072kB inactive_anon:68800kB active_file:100kB 
inactive_file:340kB unevictable:4176332kB isolated(anon):0kB isolated(file):0kB 
present:4783356kB mlocked:4176332kB dirty:0kB writeback:28kB mapped:13600kB 
shmem:1376kB slab_reclaimable:5052kB slab_unreclaimable:12172kB 
kernel_stack:1400kB pagetables:11156kB unstable:0kB bounce:0kB 
writeback_tmp:0kB pages_scanned:747 all_unreclaimable? yes
[7226178.039682] lowmem_reserve[]: 0 0 0 0
[7226178.039686] Node 0 DMA: 2*4kB 2*8kB 1*16kB 1*32kB 2*64kB 0*128kB 1*256kB 
0*512kB 1*1024kB 1*2048kB 3*4096kB = 15816kB
[7226178.039696] Node 0 DMA32: 214*4kB 166*8kB 165*16kB 77*32kB 42*64kB 
30*128kB 14*256kB 0*512kB 0*1024kB 1*2048kB 1*4096kB = 23544kB
[7226178.039705] Node 0 Normal: 731*4kB 3*8kB 0*16kB 2*32kB 2*64kB 1*128kB 
2*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 6852kB
[7226178.039714] 4707 total pagecache pages
[7226178.039715] 0 pages in swap cache
[7226178.039717] Swap cache stats: add 0, delete 0, find 0/0
[7226178.039719] Free swap  = 0kB
[7226178.039720] Total swap = 0kB
[7226178.064611] 2097135 pages RAM
[7226178.064614] 47927 pages reserved
[7226178.064615] 14100 pages shared
[7226178.064616] 2031896 pages non-shared
[7226178.064620] Out of memory: kill process 11670 (java) score 13723 or a child
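
A note on reading the three "Node 0 <zone>: N*4kB ..." lines above: each term is a count of free blocks of that size, and the total after "=" is their sum. The Python sketch below redoes that arithmetic for the three lines from this snip (re-joined onto one line each); buddy_total_kb is a made-up helper name.

```python
import re

# Free-list lines copied from the dmesg snip, re-joined onto one line each.
BUDDY_LINES = {
    "DMA": "2*4kB 2*8kB 1*16kB 1*32kB 2*64kB 0*128kB 1*256kB "
           "0*512kB 1*1024kB 1*2048kB 3*4096kB = 15816kB",
    "DMA32": "214*4kB 166*8kB 165*16kB 77*32kB 42*64kB "
             "30*128kB 14*256kB 0*512kB 0*1024kB 1*2048kB 1*4096kB = 23544kB",
    "Normal": "731*4kB 3*8kB 0*16kB 2*32kB 2*64kB 1*128kB "
              "2*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 6852kB",
}


def buddy_total_kb(line):
    """Sum count * block_size over every 'count*sizekB' term on the line."""
    return sum(int(count) * int(size)
               for count, size in re.findall(r"(\d+)\*(\d+)kB", line))


def reported_total_kb(line):
    """Return the total the kernel printed after the '=' sign."""
    return int(re.search(r"=\s*(\d+)kB", line).group(1))


if __name__ == "__main__":
    for zone, line in BUDDY_LINES.items():
        print(zone, buddy_total_kb(line), reported_total_kb(line))
```

For the Normal zone, 731 of the free blocks are single 4kB pages, so a large part of the little free memory that remains is in the smallest block size.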


On Jul 4, 2011, at 2:42 PM, Jonathan Ellis wrote:

> mmap'd data will be attributed to res, but the OS can page it out
> instead of killing the process.
> 
> On Mon, Jul 4, 2011 at 5:52 AM, Daniel Doubleday
> <daniel.double...@gmx.net> wrote:
>> Hi all,
>> we have a mem problem with cassandra. res goes up without bounds (well until
>> the os kills the process because we dont have swap)
>> I found a thread that's about the same problem but on OpenJDK:
>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Very-high-memory-utilization-not-caused-by-mmap-on-sstables-td5840777.html
>> We are on Debian with Sun JDK.
>> Resident mem is 7.4G while heap is restricted to 3G.
>> Anyone else is seeing this with Sun JDK?
>> Cheers,
>> Daniel
>> :/home/dd# java -version
>> java version "1.6.0_24"
>> Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
>> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
>> :/home/dd# ps aux |grep java
>> cass     28201  9.5 46.8 372659544 7707172 ?   SLl  May24 5656:21
>> /usr/bin/java -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42
>> -Xms3000M -Xmx3000M -Xmn400M ...
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 
>> 
>> 28201 cass      20   0  355g 7.4g 1.4g S    8 46.9   5656:25 java
>> 
>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
