One thing to consider is using a key that can be stored "inline". The docs link Barry sent you has a section titled "Using Key Storage Optimization". Your keys do not appear to be optimized, since the entry class name contains "ObjectKey". If you can shorten the String key, or switch to an Integer, Long, or UUID, then the per-key memory overhead ends up being just the region entry object itself.
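
To make that concrete, here is a rough, untested sketch of what your region could look like keyed by java.util.UUID instead of the full path String. The UUID.nameUUIDFromBytes() mapping is only an illustration of one way to derive a compact key, and I left out AwsS3CacheLoader since it would need to resolve the new key type back to an S3 object. (For the overflow side of the discussion, I've appended another sketch after the quoted thread below.)

Cache cache = new CacheFactory()
    .set("locators", LOCATORS.get())
    .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
    .set("bind-address", LOCATOR_IP.get())
    .create();

// Same disk store as before; only the key type changes.
cache.createDiskStoreFactory()
    .setMaxOplogSize(500)
    .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
    .setCompactionThreshold(95)
    .create("-ccio-store");

// Keyed by UUID (or Integer/Long, or a short String) so the key can be stored
// inline in the region entry instead of as a separate key object on the heap.
RegionFactory<UUID, byte[]> regionFactory = cache.createRegionFactory();
Region<UUID, byte[]> region = regionFactory
    .setDiskStoreName("-ccio-store")
    .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
    .setOffHeap(false)
    .setMulticastEnabled(false)
    .create("ccio-images");

// Hypothetical key derivation: a deterministic UUID from the file path,
// rather than using the path String itself as the key.
UUID key = UUID.nameUUIDFromBytes("images/2016/04/example.jpg".getBytes());
region.put(key, new byte[0] /* the cached file bytes */);

If the optimization kicks in, a heap histogram should show a region entry class without "ObjectKey" in its name and no separate key objects alongside the entries.
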
On Tue, Apr 26, 2016 at 2:51 PM, Barry Oglesby <[email protected]> wrote:

> With partitioned regions, the keys are not on all the members. Instead, they are spread among the members. With no redundancy, there will be exactly 8m keys spread among the members. With redundant-copies=1, there will be 16m keys spread among the members (1 primary and 1 secondary copy of each key).
>
> Thanks,
> Barry Oglesby
>
> On Tue, Apr 26, 2016 at 2:08 PM, Eugene Strokin <[email protected]> wrote:
>
>> Guys, thanks a lot, it was a great help.
>> I'll try the settings you've suggested, but it looks like I misunderstood how Geode works.
>> Let me show my reasoning, and please correct me where I was wrong, because it looks like I have a design flaw right from the beginning:
>>
>> Original idea: cache files in the Geode cluster. Basically a String/byte[] cache.
>> - Each cluster node is a very small machine: 20Gb hard drive, 512Mb RAM. Initially I'm planning to have 50 nodes.
>> - I have about 16Gb for the cache on disk, and from my tests I see that I have about 300Mb in heap.
>> - Average file size is about 100kb, so on one node I could cache about 160,000 files.
>> - Average key size is less than 100 bytes, so I should need about 16Mb of RAM for keys, plus some space for references.
>> - With 50 nodes I should be able to cache about 8 million files.
>>
>> But it looks to me like all 8M keys would have to be in heap on each node, so my 512Mb would not be enough even for the keys. So there is a limit on horizontal scale defined by the RAM size of each node. In my case, it looks like I cannot have more than 20-25 nodes, roughly. Adding more nodes would result in an OOM exception.
>>
>> Am I correct? Or am I wrong in assuming that each node must keep all keys of the cluster?
>>
>> On Tue, Apr 26, 2016 at 4:38 PM, Barry Oglesby <[email protected]> wrote:
>>
>>> As others have suggested, you should configure one of the overflow algorithms on your region. I guess I would recommend the PARTITION_REDUNDANT_PERSISTENT_OVERFLOW region shortcut so that region eviction is tied to the overall JVM heap, but the configuration Anthony recommended works nicely too. It just depends on your use case.
>>>
>>> Another thing you can do is to lazily load the values from disk upon recovery.
>>>
>>> If you set gemfire.disk.recoverValues=false, then only the keys are reloaded. The values will remain on disk until you explicitly get them.
>>>
>>> In my test with 5000 120k objects, I see this difference:
>>>
>>> gemfire.disk.recoverValues=true (default):
>>> Total    296579    614162824
>>>
>>> gemfire.disk.recoverValues=false:
>>> Total    286899    14375896
>>>
>>> Of course, if you end up invoking get on all your keys, your JVM will still end up in the same state it's in now.
>>>
>>> Thanks,
>>> Barry Oglesby
>>>
>>> On Tue, Apr 26, 2016 at 12:57 PM, John Blum <[email protected]> wrote:
>>>
>>>> No, even if you set a Disk Store, it does not necessarily mean your Region is persistent! Check out GemFire's different RegionShortcuts...
>>>>
>>>> http://data-docs-samples.cfapps.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/RegionShortcut.html
>>>>
>>>> The effects of which (in terms of DataPolicy, Overflow, Eviction, Expiration) can be seen here...
>>>> https://github.com/apache/incubator-geode/blob/rel/v1.0.0-incubating.M2/geode-core/src/main/java/com/gemstone/gemfire/internal/cache/GemFireCacheImpl.java#L4814-5019
>>>>
>>>> You can, of course, specify a RegionShortcut (instead of a DataPolicy) when constructing a Region using the factory...
>>>>
>>>> http://data-docs-samples.cfapps.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/Cache.html#createRegionFactory(com.gemstone.gemfire.cache.RegionShortcut)
>>>>
>>>> On Tue, Apr 26, 2016 at 12:46 PM, Eugene Strokin <[email protected]> wrote:
>>>>
>>>>> Right, this is the region I'm still using. And the disk store looks like this:
>>>>>
>>>>> Cache cache = new CacheFactory()
>>>>>     .set("locators", LOCATORS.get())
>>>>>     .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
>>>>>     .set("bind-address", LOCATOR_IP.get())
>>>>>     .create();
>>>>>
>>>>> cache.createDiskStoreFactory()
>>>>>     .setMaxOplogSize(500)
>>>>>     .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
>>>>>     .setCompactionThreshold(95)
>>>>>     .create("-ccio-store");
>>>>>
>>>>> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>> Region<String, byte[]> region = regionFactory
>>>>>     .setDiskStoreName("-ccio-store")
>>>>>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>>     .setOffHeap(false)
>>>>>     .setMulticastEnabled(false)
>>>>>     .setCacheLoader(new AwsS3CacheLoader())
>>>>>     .create("ccio-images");
>>>>>
>>>>> I thought that since I have a disk store specified, overflow is set. Please correct me if I'm wrong.
>>>>>
>>>>> Thank you,
>>>>> Eugene
>>>>>
>>>>> On Tue, Apr 26, 2016 at 3:40 PM, Udo Kohlmeyer <[email protected]> wrote:
>>>>>
>>>>>> Hi there Eugene,
>>>>>>
>>>>>> Geode will try to keep as much data in memory as it can, depending on the LRU eviction strategy. Once data is overflowed to disk, the memory for the "value" is freed up once GC has run.
>>>>>>
>>>>>> Is this still the region configuration you are using?
>>>>>>
>>>>>> Region<String, byte[]> region = regionFactory
>>>>>>     .setDiskStoreName("-ccio-store")
>>>>>>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>>>     .setOffHeap(false)
>>>>>>     .setMulticastEnabled(false)
>>>>>>     .setCacheLoader(new AwsS3CacheLoader())
>>>>>>     .create("ccio-images");
>>>>>>
>>>>>> If not, could you please provide the current config you are testing with? This config does not enable overflow.
>>>>>>
>>>>>> http://geode.docs.pivotal.io/docs/reference/topics/memory_requirements_guidelines_and_calc.html#topic_ac4_mtz_j4
>>>>>>
>>>>>> --Udo
>>>>>>
>>>>>> On 27/04/2016 4:51 am, Eugene Strokin wrote:
>>>>>>
>>>>>> Right, I do have 1432 objects in my cache. But I thought only the keys would be in memory; the actual data would still be on disk, and when a client tries to get it, the data would be retrieved from storage.
>>>>>> I'm expecting to keep millions of records in the cache, but I don't have memory to keep all of them in there, so I've set up overflow to disk, assuming that memory would be freed up as more and more data came in.
>>>>>> Is my assumption wrong? Or do I need to have RAM for all the data?
>>>>>>
>>>>>> Thanks,
>>>>>> Eugene
>>>>>>
>>>>>> On Tue, Apr 26, 2016 at 2:04 PM, Barry Oglesby <[email protected]> wrote:
>>>>>>
>>>>>>> The VersionedThinDiskRegionEntryHeapObjectKey instances are your region entries (your data). When you restart your server, it recovers that data from disk and stores it in those region entries. Are you not meaning to persist your data?
>>>>>>>
>>>>>>> If I run a quick test with 1432 objects of ~120k data size and non-primitive keys, a histogram shows output like the one below. I deleted most of the lines that are not relevant. You can see there are 1432 VersionedThinDiskRegionEntryHeapObjectKeys, TradeKeys (my key) and VMCachedDeserializables (these are wrappers on the value). You should see something similar. The byte arrays and character arrays are most of my data.
>>>>>>>
>>>>>>> If you configure your regions to not be persistent, you won't see any of this upon recovery.
>>>>>>>
>>>>>>>  num     #instances         #bytes  class name
>>>>>>> ----------------------------------------------
>>>>>>>    1:          3229      172532264  [B
>>>>>>>    2:         37058        3199464  [C
>>>>>>>   27:          1432          80192  com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
>>>>>>>   41:          1432          34368  TradeKey
>>>>>>>   42:          1432          34368  com.gemstone.gemfire.internal.cache.VMCachedDeserializable
>>>>>>> Total        256685      184447072
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Barry Oglesby
>>>>>>>
>>>>>>> On Tue, Apr 26, 2016 at 10:09 AM, Eugene Strokin <[email protected]> wrote:
>>>>>>>
>>>>>>>> Digging more into the problem, I've found that 91% of the heap is taken by:
>>>>>>>>
>>>>>>>> 1,432 instances of "com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey", loaded by "sun.misc.Launcher$AppClassLoader @ 0xef589a90", occupy 121,257,480 (91.26%) bytes. These instances are referenced from one instance of "com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]", loaded by "sun.misc.Launcher$AppClassLoader @ 0xef589a90"
>>>>>>>>
>>>>>>>> Keywords
>>>>>>>> sun.misc.Launcher$AppClassLoader @ 0xef589a90
>>>>>>>> com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
>>>>>>>> com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]
>>>>>>>>
>>>>>>>> 1,432 instances doesn't sound like a lot, but it looks like those are big instances, about 121k each. Maybe something is wrong with my configuration and I can limit the creation of such instances?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Eugene
>>>>>>>>
>>>>>>>> On Mon, Apr 25, 2016 at 4:19 PM, Jens Deppe <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I think you're looking at the wrong info in ps.
>>>>>>>>>
>>>>>>>>> What you're showing is the virtual size (vsz) of memory. This is how much the process has requested, but that does not mean it is actually using it. In fact, your output says that Java has reserved 3Gb of memory, not 300Mb! You should instead look at the resident set size (rss option), as that will give you a much more accurate picture of what is actually using real memory.
>>>>>>>>>
>>>>>>>>> Also, remember that the JVM also needs memory for loaded code (jars and classes), JITed code, thread stacks, etc.,
>>>>>>>>> so when setting your heap size you should take that into account too.
>>>>>>>>>
>>>>>>>>> Finally, especially on virtualized hardware and doubly so on small configs, make sure you *never, ever* end up swapping, because that will really kill your performance.
>>>>>>>>>
>>>>>>>>> --Jens
>>>>>>>>>
>>>>>>>>> On Mon, Apr 25, 2016 at 12:32 PM, Anilkumar Gingade <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> >> It joined the cluster, and loaded data from overflow files.
>>>>>>>>>> Not sure if this makes the OS file system (disk buffer/cache) consume memory...
>>>>>>>>>> When you say overflow, I am assuming you are initializing the data/regions using persistence files; if so, can you try without the persistence...
>>>>>>>>>>
>>>>>>>>>> -Anil.
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 25, 2016 at 12:18 PM, Eugene Strokin <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> And when I check memory usage per process, it looks normal: Java took only 300Mb as it is supposed to, but free -m still shows no memory:
>>>>>>>>>>>
>>>>>>>>>>> # ps axo pid,vsz,comm=|sort -n -k 2
>>>>>>>>>>>   PID     VSZ
>>>>>>>>>>>   465   26396 systemd-logind
>>>>>>>>>>>   444   26724 dbus-daemon
>>>>>>>>>>>   454   27984 avahi-daemon
>>>>>>>>>>>   443   28108 avahi-daemon
>>>>>>>>>>>   344   32720 systemd-journal
>>>>>>>>>>>     1   41212 systemd
>>>>>>>>>>>   364   43132 systemd-udevd
>>>>>>>>>>> 27138   52688 sftp-server
>>>>>>>>>>>   511   53056 wpa_supplicant
>>>>>>>>>>>   769   82548 sshd
>>>>>>>>>>> 30734   83972 sshd
>>>>>>>>>>>  1068   91128 master
>>>>>>>>>>> 28534   91232 pickup
>>>>>>>>>>>  1073   91300 qmgr
>>>>>>>>>>>   519  110032 agetty
>>>>>>>>>>> 27029  115380 bash
>>>>>>>>>>> 27145  115380 bash
>>>>>>>>>>> 30736  116440 sort
>>>>>>>>>>>   385  116720 auditd
>>>>>>>>>>>   489  126332 crond
>>>>>>>>>>> 30733  139624 sshd
>>>>>>>>>>> 27027  140840 sshd
>>>>>>>>>>> 27136  140840 sshd
>>>>>>>>>>> 27143  140840 sshd
>>>>>>>>>>> 30735  148904 ps
>>>>>>>>>>>   438  242360 rsyslogd
>>>>>>>>>>>   466  447932 NetworkManager
>>>>>>>>>>>   510  527448 polkitd
>>>>>>>>>>>   770  553060 tuned
>>>>>>>>>>> 30074 2922460 java
>>>>>>>>>>>
>>>>>>>>>>> # free -m
>>>>>>>>>>>               total        used        free      shared  buff/cache   available
>>>>>>>>>>> Mem:            489         424           5           0          58          41
>>>>>>>>>>> Swap:           255          57         198
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Apr 25, 2016 at 2:52 PM, Eugene Strokin <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your help, but I'm still struggling with the system OOM killer issue. I've been doing more digging and still couldn't find the problem. All settings are normal: overcommit_memory=0, overcommit_ratio=50.
>>>>>>>>>>>> free -m before the process starts:
>>>>>>>>>>>>
>>>>>>>>>>>> # free -m
>>>>>>>>>>>>               total        used        free      shared  buff/cache   available
>>>>>>>>>>>> Mem:            489          25         399           1          63         440
>>>>>>>>>>>> Swap:           255          57         198
>>>>>>>>>>>>
>>>>>>>>>>>> I start my process like this:
>>>>>>>>>>>>
>>>>>>>>>>>> java -server -Xmx300m -Xms300m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=55 -jar /opt/ccio-image.jar
>>>>>>>>>>>>
>>>>>>>>>>>> So I should still have about 99Mb of free memory, but:
>>>>>>>>>>>>
>>>>>>>>>>>> # free -m
>>>>>>>>>>>>               total        used        free      shared  buff/cache   available
>>>>>>>>>>>> Mem:            489         409           6           1          73          55
>>>>>>>>>>>> Swap:           255          54         201
>>>>>>>>>>>>
>>>>>>>>>>>> And I haven't even made a single call to the process yet. It joined the cluster and loaded data from the overflow files, and all my free memory is gone, even though I've set a 300Mb max for Java.
>>>>>>>>>>>> As I mentioned before, I've set the off-heap setting to false:
>>>>>>>>>>>>
>>>>>>>>>>>> Cache cache = new CacheFactory()
>>>>>>>>>>>>     .set("locators", LOCATORS.get())
>>>>>>>>>>>>     .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
>>>>>>>>>>>>     .set("bind-address", LOCATOR_IP.get())
>>>>>>>>>>>>     .create();
>>>>>>>>>>>>
>>>>>>>>>>>> cache.createDiskStoreFactory()
>>>>>>>>>>>>     .setMaxOplogSize(500)
>>>>>>>>>>>>     .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
>>>>>>>>>>>>     .setCompactionThreshold(95)
>>>>>>>>>>>>     .create("-ccio-store");
>>>>>>>>>>>>
>>>>>>>>>>>> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>>>>>>>>>
>>>>>>>>>>>> Region<String, byte[]> region = regionFactory
>>>>>>>>>>>>     .setDiskStoreName("-ccio-store")
>>>>>>>>>>>>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>>>>>>>>>     .setOffHeap(false)
>>>>>>>>>>>>     .setMulticastEnabled(false)
>>>>>>>>>>>>     .setCacheLoader(new AwsS3CacheLoader())
>>>>>>>>>>>>     .create("ccio-images");
>>>>>>>>>>>>
>>>>>>>>>>>> I don't understand how the memory is getting overcommitted.
>>>>>>>>>>>>
>>>>>>>>>>>> Eugene
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Apr 22, 2016 at 8:03 PM, Barry Oglesby <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The OOM killer uses the overcommit_memory and overcommit_ratio parameters to determine if/when to kill a process.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What are the settings for these parameters in your environment? The defaults are 0 and 50.
>>>>>>>>>>>>>
>>>>>>>>>>>>> cat /proc/sys/vm/overcommit_memory
>>>>>>>>>>>>> 0
>>>>>>>>>>>>>
>>>>>>>>>>>>> cat /proc/sys/vm/overcommit_ratio
>>>>>>>>>>>>> 50
>>>>>>>>>>>>>
>>>>>>>>>>>>> How much free memory is available before you start the JVM?
>>>>>>>>>>>>> How much free memory is available when your process is killed?
>>>>>>>>>>>>>
>>>>>>>>>>>>> You can monitor free memory using either free or vmstat before and during your test.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Run free -m in a loop to monitor free memory like:
>>>>>>>>>>>>>
>>>>>>>>>>>>> free -ms2
>>>>>>>>>>>>>              total       used       free     shared    buffers     cached
>>>>>>>>>>>>> Mem:        290639      35021     255617          0       9215      21396
>>>>>>>>>>>>> -/+ buffers/cache:       4408     286230
>>>>>>>>>>>>> Swap:        20473          0      20473
>>>>>>>>>>>>>
>>>>>>>>>>>>> Run vmstat in a loop to monitor memory like:
>>>>>>>>>>>>>
>>>>>>>>>>>>> vmstat -SM 2
>>>>>>>>>>>>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>>>>>>>>>>>>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>>>>>>>>>>>>>  0  0      0 255619   9215  21396    0    0     0    23    0    0  2  0 98  0  0
>>>>>>>>>>>>>  0  0      0 255619   9215  21396    0    0     0     0  121  198  0  0 100  0  0
>>>>>>>>>>>>>  0  0      0 255619   9215  21396    0    0     0     0  102  189  0  0 100  0  0
>>>>>>>>>>>>>  0  0      0 255619   9215  21396    0    0     0     0  110  195  0  0 100  0  0
>>>>>>>>>>>>>  0  0      0 255619   9215  21396    0    0     0     0  117  205  0  0 100  0  0
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Barry Oglesby
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Apr 22, 2016 at 4:44 PM, Dan Smith <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The Java metaspace will also take up memory. Maybe try setting -XX:MaxMetaspaceSize.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Dan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -------- Original message --------
>>>>>>>>>>>>>> From: Eugene Strokin <[email protected]>
>>>>>>>>>>>>>> Date: 4/22/2016 4:34 PM (GMT-08:00)
>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>> Subject: Re: System Out of Memory
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The machine is small; it has only 512mb RAM plus 256mb swap. But Java's max heap size is set to 400mb. I've tried less, no help. And the most interesting part is that I don't see Java OOM exceptions at all. I once included code with a memory leak, and then I did see Java OOM exceptions before the Java process got killed.
>>>>>>>>>>>>>> I've browsed the internet, and some people have actually noticed the same problem with other frameworks, not Geode. So I suspect this might not be Geode, but Geode was the first suspect because it has an off-heap storage feature. They say there was a memory leak, but for some reason the OS was killing the process even before Java hit OOM.
>>>>>>>>>>>>>> I'll connect with JProbe and monitor the system with the console. I'll let you know if I find something interesting.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Eugene
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Apr 22, 2016 at 5:55 PM, Dan Smith <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What's the -Xmx for your JVM set to, and how much memory does your droplet have? Does it have any swap space? My guess is you need to reduce the heap size of your JVM; the OS is killing your process because there is not enough memory left.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Dan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Apr 22, 2016 at 1:55 PM, Darrel Schneider <[email protected]> wrote:
>>>>>>>>>>>>>>> > I don't know why your OS would be killing your process, which seems like your main problem.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > But I did want you to know that if you don't have any regions with off-heap=true, then you have no reason to set off-heap-memory-size to anything other than 0.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > On Fri, Apr 22, 2016 at 12:48 PM, Eugene Strokin <[email protected]> wrote:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> I'm running load tests on the Geode cluster I've built.
>>>>>>>>>>>>>>> >> The OS is killing my process occasionally, complaining that the process takes too much memory:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> # dmesg
>>>>>>>>>>>>>>> >> [ 2544.932226] Out of memory: Kill process 5382 (java) score 780 or sacrifice child
>>>>>>>>>>>>>>> >> [ 2544.933591] Killed process 5382 (java) total-vm:3102804kB, anon-rss:335780kB, file-rss:0kB
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Java doesn't have any problems; I don't see an OOM exception.
>>>>>>>>>>>>>>> >> It looks like Geode is using off-heap memory. But I set offHeap to false for my region, and I have only one region:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>>>>>>>>>>>> >> regionFactory
>>>>>>>>>>>>>>> >>     .setDiskStoreName("-ccio-store")
>>>>>>>>>>>>>>> >>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>>>>>>>>>>>> >>     .setOffHeap(false)
>>>>>>>>>>>>>>> >>     .setCacheLoader(new AwsS3CacheLoader());
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Also, I've played with the off-heap-memory-size setting, setting it to a small number like 20M to prevent Geode from taking too much off-heap memory, but the result is the same.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Do you have any other ideas what I could do here? I'm stuck at this point.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Thank you,
>>>>>>>>>>>>>>> >> Eugene
>>>>
>>>> --
>>>> -John
>>>> 503-504-8657
>>>> john.blum10101 (skype)
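
P.S. Since the overflow advice is spread across several messages above, here is a rough, untested sketch that pulls it together: Barry's PARTITION_REDUNDANT_PERSISTENT_OVERFLOW shortcut, heap-based eviction, and lazy value recovery. The 55% eviction threshold simply mirrors the CMSInitiatingOccupancyFraction you are already using, and the -D flag is the property name from Barry's mail; treat both as starting points rather than recommendations.

// Start the JVM with something like:
//   java -server -Xmx300m -Xms300m -Dgemfire.disk.recoverValues=false -jar /opt/ccio-image.jar
// so that on restart only the keys are recovered into the heap; values stay on disk until fetched.

Cache cache = new CacheFactory()
    .set("locators", LOCATORS.get())
    .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
    .set("bind-address", LOCATOR_IP.get())
    .create();

// Begin evicting (overflowing values to disk) once the heap passes roughly 55%.
cache.getResourceManager().setEvictionHeapPercentage(55.0f);

cache.createDiskStoreFactory()
    .setMaxOplogSize(500)
    .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
    .setCompactionThreshold(95)
    .create("-ccio-store");

// The shortcut bundles the PARTITION data policy, persistence, one redundant
// copy, and heap-LRU eviction with overflow to disk, so values can leave the
// heap instead of all staying resident as with plain PERSISTENT_PARTITION.
Region<String, byte[]> region = cache
    .<String, byte[]>createRegionFactory(RegionShortcut.PARTITION_REDUNDANT_PERSISTENT_OVERFLOW)
    .setDiskStoreName("-ccio-store")
    .setCacheLoader(new AwsS3CacheLoader())
    .create("ccio-images");

Note that redundant-copies=1 doubles the number of keys held across the cluster, as Barry points out at the top of the thread, so it trades memory for availability.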
