Digging more into the problem, I've found that 91% of the heap is taken by:

1,432 instances of *"com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey"*, loaded by *"sun.misc.Launcher$AppClassLoader @ 0xef589a90"*, occupy *121,257,480 (91.26%)* bytes. These instances are referenced from one instance of *"com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]"*, loaded by *"sun.misc.Launcher$AppClassLoader @ 0xef589a90"*.

*Keywords*
sun.misc.Launcher$AppClassLoader @ 0xef589a90
com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]

1,432 instances doesn't sound like a lot, but those look like big instances: roughly 85k retained each on average (121,257,480 bytes / 1,432). Maybe something is wrong with my configuration and I can limit the creation of such instances?

Thanks,
Eugene
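One way to bound how much heap those entries retain, assuming the bulk of the retained size is the in-memory values rather than the entry objects themselves, is LRU eviction with overflow to the disk store the region already uses. A minimal sketch, not from this thread; the class name and the 10,000-entry limit are made up:

import java.io.File;

import com.gemstone.gemfire.cache.Cache;
import com.gemstone.gemfire.cache.CacheFactory;
import com.gemstone.gemfire.cache.DataPolicy;
import com.gemstone.gemfire.cache.EvictionAction;
import com.gemstone.gemfire.cache.EvictionAttributes;
import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.cache.RegionFactory;

// Sketch only: cap how many entries keep their values in heap. Evicted values are
// written to the disk store; the entry and key objects themselves remain in heap,
// but the retained size they report should shrink once values overflow.
public class BoundedRegionSketch {
  public static void main(String[] args) {
    Cache cache = new CacheFactory().create();

    cache.createDiskStoreFactory()
        .setDiskDirs(new File[] { new File("/opt/ccio/geode/store") })
        .create("-ccio-store");

    RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
    Region<String, byte[]> region = regionFactory
        .setDiskStoreName("-ccio-store")
        .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
        // Keep at most 10,000 values in heap; overflow the rest to disk (LRU).
        .setEvictionAttributes(EvictionAttributes.createLRUEntryAttributes(
            10000, EvictionAction.OVERFLOW_TO_DISK))
        .create("ccio-images");
  }
}

If a fixed entry count is hard to pick, heap-percentage eviction (EvictionAttributes.createLRUHeapAttributes together with the resource manager's eviction-heap-percentage) is the usual alternative.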
On Mon, Apr 25, 2016 at 4:19 PM, Jens Deppe <[email protected]> wrote:

> I think you're looking at the wrong info in ps.
>
> What you're showing is the Virtual size (vsz) of memory. This is how much
> the process has requested, but that does not mean it is actually using it.
> In fact, your output says that Java has reserved 3Gb of memory, not 300Mb!
> You should instead look at the Resident Set Size (rss option) as that will
> give you a much more accurate picture of what is actually using real memory.
>
> Also, remember that the JVM also needs memory for loaded code (jars and
> classes), JITed code, thread stacks, etc., so when setting your heap size
> you should take that into account too.
>
> Finally, especially on virtualized hardware and doubly so on small
> configs, make sure you *never, ever* end up swapping, because that will
> really kill your performance.
>
> --Jens
>
> On Mon, Apr 25, 2016 at 12:32 PM, Anilkumar Gingade <[email protected]> wrote:
>
>> It joined the cluster, and loaded data from overflow files.
>> Not sure if this makes the OS file-system (disk buffer/cache) consume memory...
>> When you say overflow, I am assuming you are initializing the data/regions
>> using persistence files; if so, can you try without the persistence...
>>
>> -Anil.
>>
>> On Mon, Apr 25, 2016 at 12:18 PM, Eugene Strokin <[email protected]> wrote:
>>
>>> And when I check memory usage per process, it looks normal; java took
>>> only 300Mb as it's supposed to, but free -m still shows no free memory:
>>>
>>> # ps axo pid,vsz,comm=|sort -n -k 2
>>>   PID     VSZ
>>>   465   26396 systemd-logind
>>>   444   26724 dbus-daemon
>>>   454   27984 avahi-daemon
>>>   443   28108 avahi-daemon
>>>   344   32720 systemd-journal
>>>     1   41212 systemd
>>>   364   43132 systemd-udevd
>>> 27138   52688 sftp-server
>>>   511   53056 wpa_supplicant
>>>   769   82548 sshd
>>> 30734   83972 sshd
>>>  1068   91128 master
>>> 28534   91232 pickup
>>>  1073   91300 qmgr
>>>   519  110032 agetty
>>> 27029  115380 bash
>>> 27145  115380 bash
>>> 30736  116440 sort
>>>   385  116720 auditd
>>>   489  126332 crond
>>> 30733  139624 sshd
>>> 27027  140840 sshd
>>> 27136  140840 sshd
>>> 27143  140840 sshd
>>> 30735  148904 ps
>>>   438  242360 rsyslogd
>>>   466  447932 NetworkManager
>>>   510  527448 polkitd
>>>   770  553060 tuned
>>> 30074 2922460 java
>>>
>>> # free -m
>>>         total  used  free  shared  buff/cache  available
>>> Mem:      489   424     5       0          58         41
>>> Swap:     255    57   198
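To see where the footprint beyond the -Xmx heap goes, here is a small self-contained sketch (not part of the thread) that prints heap and non-heap usage from inside the JVM using the standard MemoryMXBean; thread stacks and direct buffers do not show up in these numbers:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Sketch only: log heap vs. non-heap (metaspace, code cache, ...) usage.
// The same calls could be made periodically from inside the Geode server process.
public class MemoryReport {
  public static void main(String[] args) {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
    System.out.printf("heap:     used=%dM committed=%dM%n",
        heap.getUsed() >> 20, heap.getCommitted() >> 20);
    System.out.printf("non-heap: used=%dM committed=%dM%n",
        nonHeap.getUsed() >> 20, nonHeap.getCommitted() >> 20);
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      MemoryUsage u = pool.getUsage();
      if (u == null) {
        continue; // some pools do not report usage
      }
      System.out.printf("  %-25s used=%dM committed=%dM%n",
          pool.getName(), u.getUsed() >> 20, u.getCommitted() >> 20);
    }
  }
}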
>>> On Mon, Apr 25, 2016 at 2:52 PM, Eugene Strokin <[email protected]> wrote:
>>>
>>>> Thanks for your help, but I'm still struggling with the system OOM killer issue.
>>>> I've been doing more digging and still couldn't find the problem.
>>>> All settings are at their defaults: overcommit_memory=0, overcommit_ratio=50.
>>>>
>>>> free -m before the process starts:
>>>>
>>>> # free -m
>>>>         total  used  free  shared  buff/cache  available
>>>> Mem:      489    25   399       1          63        440
>>>> Swap:     255    57   198
>>>>
>>>> I start my process like this:
>>>>
>>>> java -server -Xmx300m -Xms300m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=55 -jar /opt/ccio-image.jar
>>>>
>>>> So I should still have about 99Mb of free memory, but:
>>>>
>>>> # free -m
>>>>         total  used  free  shared  buff/cache  available
>>>> Mem:      489   409     6       1          73         55
>>>> Swap:     255    54   201
>>>>
>>>> And I haven't even made a single call to the process yet. It joined the
>>>> cluster and loaded data from overflow files, and all my free memory is
>>>> gone, even though I've set a 300Mb max for Java.
>>>> As I mentioned before, I've set the region's off-heap setting to false:
>>>>
>>>> Cache cache = new CacheFactory()
>>>>     .set("locators", LOCATORS.get())
>>>>     .set("start-locator", LOCATOR_IP.get() + "[" + LOCATOR_PORT.get() + "]")
>>>>     .set("bind-address", LOCATOR_IP.get())
>>>>     .create();
>>>>
>>>> cache.createDiskStoreFactory()
>>>>     .setMaxOplogSize(500)
>>>>     .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") },
>>>>         new int[] { 18000 })
>>>>     .setCompactionThreshold(95)
>>>>     .create("-ccio-store");
>>>>
>>>> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>
>>>> Region<String, byte[]> region = regionFactory
>>>>     .setDiskStoreName("-ccio-store")
>>>>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>     .setOffHeap(false)
>>>>     .setMulticastEnabled(false)
>>>>     .setCacheLoader(new AwsS3CacheLoader())
>>>>     .create("ccio-images");
>>>>
>>>> I don't understand how the memory is getting overcommitted.
>>>>
>>>> Eugene
>>>>
>>>> On Fri, Apr 22, 2016 at 8:03 PM, Barry Oglesby <[email protected]> wrote:
>>>>
>>>>> The OOM killer uses the overcommit_memory and overcommit_ratio
>>>>> parameters to determine if / when to kill a process.
>>>>>
>>>>> What are the settings for these parameters in your environment?
>>>>> The defaults are 0 and 50.
>>>>>
>>>>> cat /proc/sys/vm/overcommit_memory
>>>>> 0
>>>>>
>>>>> cat /proc/sys/vm/overcommit_ratio
>>>>> 50
>>>>>
>>>>> How much free memory is available before you start the JVM?
>>>>> How much free memory is available when your process is killed?
>>>>> You can monitor free memory using either free or vmstat before and
>>>>> during your test.
>>>>>
>>>>> Run free -m in a loop to monitor free memory like:
>>>>>
>>>>> free -ms2
>>>>>              total    used    free  shared  buffers  cached
>>>>> Mem:        290639   35021  255617       0     9215   21396
>>>>> -/+ buffers/cache:    4408  286230
>>>>> Swap:        20473       0   20473
>>>>>
>>>>> Run vmstat in a loop to monitor memory like:
>>>>>
>>>>> vmstat -SM 2
>>>>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>>>>>  r  b  swpd    free  buff  cache   si   so   bi   bo   in   cs  us  sy  id  wa  st
>>>>>  0  0     0  255619  9215  21396    0    0    0   23    0    0   2   0  98   0   0
>>>>>  0  0     0  255619  9215  21396    0    0    0    0  121  198   0   0 100   0   0
>>>>>  0  0     0  255619  9215  21396    0    0    0    0  102  189   0   0 100   0   0
>>>>>  0  0     0  255619  9215  21396    0    0    0    0  110  195   0   0 100   0   0
>>>>>  0  0     0  255619  9215  21396    0    0    0    0  117  205   0   0 100   0   0
>>>>>
>>>>> Thanks,
>>>>> Barry Oglesby
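The same watch can be kept from inside the process. A rough sketch (assumes Linux with a /proc filesystem; not from the thread) that logs MemAvailable next to JVM heap use every two seconds, mirroring the free/vmstat loops above:

import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Sketch only: poll Linux MemAvailable (from /proc/meminfo) alongside JVM heap use,
// so the moment system memory runs out can be correlated with what the JVM is doing.
public class FreeMemoryLogger {
  public static void main(String[] args) throws Exception {
    while (true) {
      List<String> meminfo = Files.readAllLines(Paths.get("/proc/meminfo"));
      String available = meminfo.stream()
          .filter(line -> line.startsWith("MemAvailable"))
          .findFirst()
          .orElse("MemAvailable: n/a");
      long heapUsedMb = ManagementFactory.getMemoryMXBean()
          .getHeapMemoryUsage().getUsed() >> 20;
      System.out.println(available.trim() + " | JVM heap used: " + heapUsedMb + "M");
      Thread.sleep(2000);
    }
  }
}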
>>>>> On Fri, Apr 22, 2016 at 4:44 PM, Dan Smith <[email protected]> wrote:
>>>>>
>>>>>> The java metaspace will also take up memory. Maybe try setting
>>>>>> -XX:MaxMetaspaceSize
>>>>>>
>>>>>> -Dan
>>>>>>
>>>>>> -------- Original message --------
>>>>>> From: Eugene Strokin <[email protected]>
>>>>>> Date: 4/22/2016 4:34 PM (GMT-08:00)
>>>>>> To: [email protected]
>>>>>> Subject: Re: System Out of Memory
>>>>>>
>>>>>> The machine is small, it has only 512mb RAM, plus 256mb swap.
>>>>>> But Java's max heap size is set to 400mb. I've tried less, no help. And
>>>>>> the most interesting part is that I don't see Java OOM exceptions at all.
>>>>>> I even included code with a memory leak, and in that case I did see Java
>>>>>> OOM exceptions before the java process got killed.
>>>>>> I've browsed the internet, and some people have noticed the same problem
>>>>>> with other frameworks, not Geode. So I suspect this might not be Geode,
>>>>>> but Geode was the first suspect because it has an off-heap storage
>>>>>> feature. They say that there was a memory leak, but for some reason the
>>>>>> OS was killing the process even before Java hit OOM.
>>>>>> I'll connect with JProbe and monitor the system with the console. Will
>>>>>> let you know if I find something interesting.
>>>>>>
>>>>>> Thanks,
>>>>>> Eugene
>>>>>>
>>>>>> On Fri, Apr 22, 2016 at 5:55 PM, Dan Smith <[email protected]> wrote:
>>>>>>
>>>>>>> What's your -Xmx for your JVM set to, and how much memory does your
>>>>>>> droplet have? Does it have any swap space? My guess is you need to
>>>>>>> reduce the heap size of your JVM and the OS is killing your process
>>>>>>> because there is not enough memory left.
>>>>>>>
>>>>>>> -Dan
>>>>>>>
>>>>>>> On Fri, Apr 22, 2016 at 1:55 PM, Darrel Schneider <[email protected]> wrote:
>>>>>>>
>>>>>>>> I don't know why your OS would be killing your process, which seems
>>>>>>>> like your main problem.
>>>>>>>>
>>>>>>>> But I did want you to know that if you don't have any regions with
>>>>>>>> off-heap=true, then you have no reason to have off-heap-memory-size
>>>>>>>> set to anything other than 0.
>>>>>>>>
>>>>>>>> On Fri, Apr 22, 2016 at 12:48 PM, Eugene Strokin <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I'm running load tests on the Geode cluster I've built.
>>>>>>>>> The OS is killing my process occasionally, complaining that the
>>>>>>>>> process takes too much memory:
>>>>>>>>>
>>>>>>>>> # dmesg
>>>>>>>>> [ 2544.932226] Out of memory: Kill process 5382 (java) score 780 or sacrifice child
>>>>>>>>> [ 2544.933591] Killed process 5382 (java) total-vm:3102804kB, anon-rss:335780kB, file-rss:0kB
>>>>>>>>>
>>>>>>>>> Java doesn't have any problems; I don't see an OOM exception.
>>>>>>>>> It looks like Geode is using off-heap memory. But I set offHeap to
>>>>>>>>> false for my region, and I have only one region:
>>>>>>>>>
>>>>>>>>> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>>>>>> regionFactory
>>>>>>>>>     .setDiskStoreName("-ccio-store")
>>>>>>>>>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>>>>>>     .setOffHeap(false)
>>>>>>>>>     .setCacheLoader(new AwsS3CacheLoader());
>>>>>>>>>
>>>>>>>>> Also, I've played with the off-heap-memory-size setting, setting it
>>>>>>>>> to a small number like 20M to prevent Geode from taking too much
>>>>>>>>> off-heap memory, but the result is the same.
>>>>>>>>>
>>>>>>>>> Do you have any other ideas about what I could do here? I'm stuck at
>>>>>>>>> this point.
>>>>>>>>> Thank you,
>>>>>>>>> Eugene
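As a footnote to Darrel's point above: with no off-heap regions, off-heap-memory-size can simply be left at 0 so nothing is reserved outside the heap for Geode's off-heap store. A minimal sketch (illustrative only; the locator and disk-store properties used earlier in the thread are omitted):

import com.gemstone.gemfire.cache.Cache;
import com.gemstone.gemfire.cache.CacheFactory;

// Sketch only: no off-heap regions, so keep off-heap-memory-size at 0.
public class NoOffHeapCacheSketch {
  public static void main(String[] args) {
    Cache cache = new CacheFactory()
        .set("off-heap-memory-size", "0")
        .create();
  }
}

Whatever -Xmx ends up being, the 512Mb droplet also has to hold metaspace, code cache, and thread stacks (Jens' and Dan's point above), so the heap needs to be sized well below physical memory.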
