No, even setting a Disk Store does not necessarily mean your Region is persistent! Check out GemFire's different RegionShortcuts...
http://data-docs-samples.cfapps.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/RegionShortcut.html

The effects of which (in terms of DataPolicy, Overflow, Eviction, Expiration) can be seen here...

https://github.com/apache/incubator-geode/blob/rel/v1.0.0-incubating.M2/geode-core/src/main/java/com/gemstone/gemfire/internal/cache/GemFireCacheImpl.java#L4814-5019

You can, of course, specify a RegionShortcut (instead of a DataPolicy) when constructing a Region using the factory...

http://data-docs-samples.cfapps.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/Cache.html#createRegionFactory(com.gemstone.gemfire.cache.RegionShortcut)

On Tue, Apr 26, 2016 at 12:46 PM, Eugene Strokin <[email protected]> wrote:
> Right, this is the region I'm still using. And the disk store looks like this:
>
> Cache cache = new CacheFactory()
>   .set("locators", LOCATORS.get())
>   .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
>   .set("bind-address", LOCATOR_IP.get())
>   .create();
>
> cache.createDiskStoreFactory()
>   .setMaxOplogSize(500)
>   .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
>   .setCompactionThreshold(95)
>   .create("-ccio-store");
>
> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
> Region<String, byte[]> region = regionFactory
>   .setDiskStoreName("-ccio-store")
>   .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>   .setOffHeap(false)
>   .setMulticastEnabled(false)
>   .setCacheLoader(new AwsS3CacheLoader())
>   .create("ccio-images");
>
> I thought that, since I have a disk store specified, the overflow is set.
> Please correct me if I'm wrong.
>
> Thank you,
> Eugene
>
> On Tue, Apr 26, 2016 at 3:40 PM, Udo Kohlmeyer <[email protected]> wrote:
>
>> Hi there Eugene,
>>
>> Geode will try to keep as much data in memory as it can, depending on the LRU eviction strategy. Once data is overflowed to disk, the memory for the "value" would be freed up once GC has run.
>>
>> Is this still the correct region configuration you are using?
>>
>> Region<String, byte[]> region = regionFactory
>>   .setDiskStoreName("-ccio-store")
>>   .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>   .setOffHeap(false)
>>   .setMulticastEnabled(false)
>>   .setCacheLoader(new AwsS3CacheLoader())
>>   .create("ccio-images");
>>
>> If not, could you please provide the current config you are testing with? Because this config does not enable overflow.
>>
>> <http://geode.docs.pivotal.io/docs/reference/topics/memory_requirements_guidelines_and_calc.html#topic_ac4_mtz_j4>
>> --Udo
>>
>> On 27/04/2016 4:51 am, Eugene Strokin wrote:
>>
>> Right, I do have 1432 objects in my cache. But I thought only the keys would be in memory, while the actual data would still be on the disk, and when a client tried to get it, the data would be retrieved from the storage.
>> I'm expecting to keep millions of records in the cache, but I don't have the memory to keep all of them in there, so I've set up overflow to the disk, assuming that the memory will be freed up as more and more data comes in.
>> Is my assumption wrong? Or do I need to have RAM for all the data?
>>
>> Thanks,
>> Eugene
>>
>> On Tue, Apr 26, 2016 at 2:04 PM, Barry Oglesby <[email protected]> wrote:
>>
>>> The VersionedThinDiskRegionEntryHeapObjectKey instances are your region entries (your data). When you restart your server, it recovers that data from disk and stores it in those Region entries. Are you not meaning to persist your data?
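To make the overflow point concrete: with the factory API above, values only leave the heap when eviction is configured (or when a RegionShortcut that includes OVERFLOW is used). A disk store plus DataPolicy.PERSISTENT_PARTITION persists every entry but still keeps all keys and values in memory. Below is a rough, untested sketch against the com.gemstone (GemFire) API used in this thread, reusing the disk store and region names from Eugene's config; the eviction threshold and locator/bind-address properties are only illustrative and are not taken from the thread.

import java.io.File;

import com.gemstone.gemfire.cache.Cache;
import com.gemstone.gemfire.cache.CacheFactory;
import com.gemstone.gemfire.cache.EvictionAction;
import com.gemstone.gemfire.cache.EvictionAttributes;
import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.cache.RegionShortcut;

public class OverflowRegionSketch {

  public static void main(String[] args) {
    // Locator and bind-address properties omitted for brevity.
    Cache cache = new CacheFactory().create();

    cache.createDiskStoreFactory()
        .setMaxOplogSize(500)
        .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") },
            new int[] { 18000 })
        .create("-ccio-store");

    // Heap-LRU eviction only starts once heap usage crosses this percentage;
    // 55 is illustrative (it happens to match the CMSInitiatingOccupancyFraction
    // used elsewhere in the thread).
    cache.getResourceManager().setEvictionHeapPercentage(55.0f);

    // Variant 1: the shortcut wires up persistence plus heap-LRU eviction with
    // overflow-to-disk in one step.
    Region<String, byte[]> images = cache
        .<String, byte[]>createRegionFactory(RegionShortcut.PARTITION_PERSISTENT_OVERFLOW)
        .setDiskStoreName("-ccio-store")
        .create("ccio-images");

    // Variant 2: keep the explicit factory style from the thread and add the
    // eviction attributes that actually enable overflow.
    Region<String, byte[]> imagesExplicit = cache
        .<String, byte[]>createRegionFactory(RegionShortcut.PARTITION_PERSISTENT)
        .setDiskStoreName("-ccio-store")
        .setEvictionAttributes(
            EvictionAttributes.createLRUHeapAttributes(null, EvictionAction.OVERFLOW_TO_DISK))
        .create("ccio-images-explicit");
  }
}

With either variant, evicted values live in the overflow files and only the keys plus the per-entry overhead remain on the heap.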
>>>
>>> If I run a quick test with 1432 objects with ~120k data size and non-primitive keys, a histogram shows output like below. I deleted most of the lines that are not relevant. You can see there are 1432 VersionedThinDiskRegionEntryHeapObjectKeys, TradeKeys (my key) and VMCachedDeserializables (these are wrappers on the value). You should see something similar. The byte arrays and character arrays are most of my data.
>>>
>>> If you configure your regions to not be persistent, you won't see any of this upon recovery.
>>>
>>>  num   #instances      #bytes  class name
>>> ----------------------------------------------
>>>    1:        3229   172532264  [B
>>>    2:       37058     3199464  [C
>>>   27:        1432       80192  com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
>>>   41:        1432       34368  TradeKey
>>>   42:        1432       34368  com.gemstone.gemfire.internal.cache.VMCachedDeserializable
>>> Total      256685   184447072
>>>
>>> Thanks,
>>> Barry Oglesby
>>>
>>> On Tue, Apr 26, 2016 at 10:09 AM, Eugene Strokin <[email protected]> wrote:
>>>
>>>> Digging more into the problem, I've found that 91% of the heap is taken by:
>>>>
>>>> 1,432 instances of *"com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey"*, loaded by *"sun.misc.Launcher$AppClassLoader @ 0xef589a90"*, occupy *121,257,480 (91.26%)* bytes. These instances are referenced from one instance of *"com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]"*, loaded by *"sun.misc.Launcher$AppClassLoader @ 0xef589a90"*
>>>>
>>>> *Keywords*
>>>> sun.misc.Launcher$AppClassLoader @ 0xef589a90
>>>> com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
>>>> com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]
>>>>
>>>> 1,432 instances don't sound like a lot, but it looks like those are big instances, about 121k each. Maybe something is wrong with my configuration and I can limit the creation of such instances?
>>>>
>>>> Thanks,
>>>> Eugene
>>>>
>>>> On Mon, Apr 25, 2016 at 4:19 PM, Jens Deppe <[email protected]> wrote:
>>>>
>>>>> I think you're looking at the wrong info in ps.
>>>>>
>>>>> What you're showing is the Virtual size (vsz) of memory. This is how much the process has requested, but that does not mean it is actually using it. In fact, your output says that Java has reserved 3Gb of memory, not 300Mb! You should instead look at the Resident Set Size (rss option), as that will give you a much more accurate picture of what is actually using real memory.
>>>>>
>>>>> Also, remember that the JVM needs memory for loaded code (jars and classes), JITed code, thread stacks, etc., so when setting your heap size you should take that into account too.
>>>>>
>>>>> Finally, especially on virtualized hardware and doubly so on small configs, make sure you *never, ever* end up swapping, because that will really kill your performance.
>>>>>
>>>>> --Jens
>>>>>
>>>>> On Mon, Apr 25, 2016 at 12:32 PM, Anilkumar Gingade <[email protected]> wrote:
>>>>>
>>>>>> >> It joined the cluster, and loaded data from overflow files.
>>>>>> Not sure if this makes the OS file-system (disk buffer/cache) consume memory...
>>>>>> When you say overflow, I am assuming you are initializing the data/regions using persistence files; if so, can you try without the persistence...
>>>>>>
>>>>>> -Anil.
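Anil's "try without the persistence" could also be done without giving up overflow. Here is a rough sketch, again against the com.gemstone API, reusing the cache, the "-ccio-store" disk store, and the AwsS3CacheLoader from Eugene's config: a PARTITION_OVERFLOW region writes evicted values to the disk store but recovers nothing into memory on restart, so entries missing after a restart come back through the CacheLoader on demand.

// Sketch of "try without the persistence": a non-persistent partitioned region
// that still overflows evicted values to the disk store. Nothing is recovered
// into the heap on restart; missing entries come back through the CacheLoader
// (Eugene's AwsS3CacheLoader) on demand. Assumes the cache and "-ccio-store"
// disk store from the configs above.
Region<String, byte[]> region = cache
    .<String, byte[]>createRegionFactory(RegionShortcut.PARTITION_OVERFLOW)
    .setDiskStoreName("-ccio-store")        // used for overflow files only
    .setCacheLoader(new AwsS3CacheLoader())
    .create("ccio-images");

Whether that trade-off is acceptable depends on how expensive the reloads from S3 are.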
>>>>>>
>>>>>> On Mon, Apr 25, 2016 at 12:18 PM, Eugene Strokin <[email protected]> wrote:
>>>>>>
>>>>>>> And when I check memory usage per process, it looks normal; Java took only 300Mb, as it was supposed to, but free -m still shows no memory:
>>>>>>>
>>>>>>> # ps axo pid,vsz,comm=|sort -n -k 2
>>>>>>>   PID     VSZ
>>>>>>>   465   26396 systemd-logind
>>>>>>>   444   26724 dbus-daemon
>>>>>>>   454   27984 avahi-daemon
>>>>>>>   443   28108 avahi-daemon
>>>>>>>   344   32720 systemd-journal
>>>>>>>     1   41212 systemd
>>>>>>>   364   43132 systemd-udevd
>>>>>>> 27138   52688 sftp-server
>>>>>>>   511   53056 wpa_supplicant
>>>>>>>   769   82548 sshd
>>>>>>> 30734   83972 sshd
>>>>>>>  1068   91128 master
>>>>>>> 28534   91232 pickup
>>>>>>>  1073   91300 qmgr
>>>>>>>   519  110032 agetty
>>>>>>> 27029  115380 bash
>>>>>>> 27145  115380 bash
>>>>>>> 30736  116440 sort
>>>>>>>   385  116720 auditd
>>>>>>>   489  126332 crond
>>>>>>> 30733  139624 sshd
>>>>>>> 27027  140840 sshd
>>>>>>> 27136  140840 sshd
>>>>>>> 27143  140840 sshd
>>>>>>> 30735  148904 ps
>>>>>>>   438  242360 rsyslogd
>>>>>>>   466  447932 NetworkManager
>>>>>>>   510  527448 polkitd
>>>>>>>   770  553060 tuned
>>>>>>> 30074 2922460 java
>>>>>>>
>>>>>>> # free -m
>>>>>>>               total        used        free      shared  buff/cache   available
>>>>>>> Mem:            489         424           5           0          58          41
>>>>>>> Swap:           255          57         198
>>>>>>>
>>>>>>> On Mon, Apr 25, 2016 at 2:52 PM, Eugene Strokin <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks for your help, but I'm still struggling with the system OOM killer issue.
>>>>>>>> I've been doing more digging and still couldn't find the problem.
>>>>>>>> All settings are normal: overcommit_memory=0, overcommit_ratio=50.
>>>>>>>> free -m before the process starts:
>>>>>>>>
>>>>>>>> # free -m
>>>>>>>>               total        used        free      shared  buff/cache   available
>>>>>>>> Mem:            489          25         399           1          63         440
>>>>>>>> Swap:           255          57         198
>>>>>>>>
>>>>>>>> I start my process like this:
>>>>>>>>
>>>>>>>> java -server -Xmx300m -Xms300m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=55 -jar /opt/ccio-image.jar
>>>>>>>>
>>>>>>>> So I should still have about 99Mb of free memory, but:
>>>>>>>>
>>>>>>>> # free -m
>>>>>>>>               total        used        free      shared  buff/cache   available
>>>>>>>> Mem:            489         409           6           1          73          55
>>>>>>>> Swap:           255          54         201
>>>>>>>>
>>>>>>>> And I haven't even made a single call to the process yet. It joined the cluster and loaded data from the overflow files. And all my free memory is gone, even though I've set a 300Mb max for Java.
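To separate what the OS reports (vsz/rss above) from what the JVM itself has committed, something like the following could be logged from inside the image server once the cache is up. JvmMemoryReport is a hypothetical helper, not something from this thread; it uses only the standard java.lang.management API.

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public final class JvmMemoryReport {

  /** Prints heap, non-heap, and per-pool committed sizes in MB. */
  public static void log() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
    System.out.printf("heap: committed=%dMB max=%dMB%n",
        toMb(heap.getCommitted()), toMb(heap.getMax()));
    System.out.printf("non-heap: committed=%dMB%n", toMb(nonHeap.getCommitted()));

    // Per-pool breakdown (Metaspace, Code Cache, CMS Old Gen, ...) shows what is
    // committed outside -Xmx; thread stacks and direct buffers still won't show up here.
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      System.out.printf("%-30s committed=%dMB%n",
          pool.getName(), toMb(pool.getUsage().getCommitted()));
    }
  }

  private static long toMb(long bytes) {
    return bytes / (1024 * 1024);
  }
}

If the committed totals reported here stay near the configured heap while RSS keeps climbing, the growth is coming from native allocations (thread stacks, direct buffers, GC and JIT overhead) rather than the heap itself.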
>>>>>>>> As I mentioned before, I've set the off-heap memory setting to false:
>>>>>>>>
>>>>>>>> Cache cache = new CacheFactory()
>>>>>>>>   .set("locators", LOCATORS.get())
>>>>>>>>   .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
>>>>>>>>   .set("bind-address", LOCATOR_IP.get())
>>>>>>>>   .create();
>>>>>>>>
>>>>>>>> cache.createDiskStoreFactory()
>>>>>>>>   .setMaxOplogSize(500)
>>>>>>>>   .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
>>>>>>>>   .setCompactionThreshold(95)
>>>>>>>>   .create("-ccio-store");
>>>>>>>>
>>>>>>>> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>>>>>
>>>>>>>> Region<String, byte[]> region = regionFactory
>>>>>>>>   .setDiskStoreName("-ccio-store")
>>>>>>>>   .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>>>>>   .setOffHeap(false)
>>>>>>>>   .setMulticastEnabled(false)
>>>>>>>>   .setCacheLoader(new AwsS3CacheLoader())
>>>>>>>>   .create("ccio-images");
>>>>>>>>
>>>>>>>> I don't understand how the memory is getting overcommitted.
>>>>>>>>
>>>>>>>> Eugene
>>>>>>>>
>>>>>>>> On Fri, Apr 22, 2016 at 8:03 PM, Barry Oglesby <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> The OOM killer uses the overcommit_memory and overcommit_ratio parameters to determine if / when to kill a process.
>>>>>>>>>
>>>>>>>>> What are the settings for these parameters in your environment?
>>>>>>>>>
>>>>>>>>> The defaults are 0 and 50.
>>>>>>>>>
>>>>>>>>> cat /proc/sys/vm/overcommit_memory
>>>>>>>>> 0
>>>>>>>>>
>>>>>>>>> cat /proc/sys/vm/overcommit_ratio
>>>>>>>>> 50
>>>>>>>>>
>>>>>>>>> How much free memory is available before you start the JVM?
>>>>>>>>>
>>>>>>>>> How much free memory is available when your process is killed?
>>>>>>>>>
>>>>>>>>> You can monitor free memory using either free or vmstat before and during your test.
>>>>>>>>>
>>>>>>>>> Run free -m in a loop to monitor free memory like:
>>>>>>>>>
>>>>>>>>> free -ms2
>>>>>>>>>              total       used       free     shared    buffers     cached
>>>>>>>>> Mem:        290639      35021     255617          0       9215      21396
>>>>>>>>> -/+ buffers/cache:       4408     286230
>>>>>>>>> Swap:        20473          0      20473
>>>>>>>>>
>>>>>>>>> Run vmstat in a loop to monitor memory like:
>>>>>>>>>
>>>>>>>>> vmstat -SM 2
>>>>>>>>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>>>>>>>>>  r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>>>>>>>>>  0  0      0  255619   9215  21396    0    0     0    23    0    0  2  0 98  0  0
>>>>>>>>>  0  0      0  255619   9215  21396    0    0     0     0  121  198  0  0 100  0  0
>>>>>>>>>  0  0      0  255619   9215  21396    0    0     0     0  102  189  0  0 100  0  0
>>>>>>>>>  0  0      0  255619   9215  21396    0    0     0     0  110  195  0  0 100  0  0
>>>>>>>>>  0  0      0  255619   9215  21396    0    0     0     0  117  205  0  0 100  0  0
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Barry Oglesby
>>>>>>>>>
>>>>>>>>> On Fri, Apr 22, 2016 at 4:44 PM, Dan Smith <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> The Java metaspace will also take up memory. Maybe try setting -XX:MaxMetaspaceSize.
>>>>>>>>>>
>>>>>>>>>> -Dan
>>>>>>>>>>
>>>>>>>>>> -------- Original message --------
>>>>>>>>>> From: Eugene Strokin <[email protected]>
>>>>>>>>>> Date: 4/22/2016 4:34 PM (GMT-08:00)
>>>>>>>>>> To: [email protected]
>>>>>>>>>> Subject: Re: System Out of Memory
>>>>>>>>>>
>>>>>>>>>> The machine is small; it has only 512mb RAM, plus 256mb swap.
>>>>>>>>>> But Java's max heap size is set to 400mb. I've tried less; it didn't help.
>>>>>>>>>> And the most interesting part is that I don't see Java OOM Exceptions at all. I even included code with a memory leak, and in that case I did see the Java OOM Exceptions before the java process got killed.
>>>>>>>>>> I've browsed the internet, and some people actually noticed the same problem with other frameworks, not Geode. So I'm suspecting this might not be Geode, but Geode was the first suspect because it has an off-heap storage feature. They say that there was a memory leak, but for some reason the OS was killing the process even before Java was getting OOM.
>>>>>>>>>> I'll connect with JProbe and will be monitoring the system with the console. Will let you know if I find something interesting.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Eugene
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 22, 2016 at 5:55 PM, Dan Smith <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> What's your -Xmx for your JVM set to, and how much memory does your droplet have? Does it have any swap space? My guess is you need to reduce the heap size of your JVM, and the OS is killing your process because there is not enough memory left.
>>>>>>>>>>>
>>>>>>>>>>> -Dan
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 22, 2016 at 1:55 PM, Darrel Schneider <[email protected]> wrote:
>>>>>>>>>>> > I don't know why your OS would be killing your process, which seems like your main problem.
>>>>>>>>>>> >
>>>>>>>>>>> > But I did want you to know that if you don't have any regions with off-heap=true, then you have no reason to set off-heap-memory-size to anything other than 0.
>>>>>>>>>>> >
>>>>>>>>>>> > On Fri, Apr 22, 2016 at 12:48 PM, Eugene Strokin <[email protected]> wrote:
>>>>>>>>>>> >>
>>>>>>>>>>> >> I'm running load tests on the Geode cluster I've built.
>>>>>>>>>>> >> The OS is killing my process occasionally, complaining that the process takes too much memory:
>>>>>>>>>>> >>
>>>>>>>>>>> >> # dmesg
>>>>>>>>>>> >> [ 2544.932226] Out of memory: Kill process 5382 (java) score 780 or sacrifice child
>>>>>>>>>>> >> [ 2544.933591] Killed process 5382 (java) total-vm:3102804kB, anon-rss:335780kB, file-rss:0kB
>>>>>>>>>>> >>
>>>>>>>>>>> >> Java doesn't have any problems; I don't see an OOM exception.
>>>>>>>>>>> >> It looks like Geode is using off-heap memory. But I set offHeap to false for my region, and I have only one region:
>>>>>>>>>>> >>
>>>>>>>>>>> >> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>>>>>>>> >> regionFactory
>>>>>>>>>>> >>   .setDiskStoreName("-ccio-store")
>>>>>>>>>>> >>   .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>>>>>>>> >>   .setOffHeap(false)
>>>>>>>>>>> >>   .setCacheLoader(new AwsS3CacheLoader());
>>>>>>>>>>> >>
>>>>>>>>>>> >> Also, I've played with the off-heap-memory-size setting, setting it to a small number like 20M to prevent Geode from taking too much off-heap memory, but the result is the same.
>>>>>>>>>>> >>
>>>>>>>>>>> >> Do you have any other ideas what I could do here? I'm stuck at this point.
>>>>>>>>>>> >>
>>>>>>>>>>> >> Thank you,
>>>>>>>>>>> >> Eugene

-- 
-John
503-504-8657
john.blum10101 (skype)
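Following up on Darrel's off-heap note, a small sanity check could be added right after cache creation (a hypothetical snippet, not from this thread): log the effective off-heap-memory-size, and set the resource-manager thresholds so that heap-LRU eviction (if configured as in the sketches above) and the critical-heap protection kick in before the small heap is exhausted. The percentages are illustrative.

// Hypothetical startup check, placed right after cache creation: confirm that no
// off-heap memory is being reserved (Darrel's point: leave off-heap-memory-size at 0
// when no region uses off-heap) and set resource-manager thresholds for a small heap.
// The percentages are illustrative, not a recommendation from the thread.
String offHeapSize = cache.getDistributedSystem().getProperties()
    .getProperty("off-heap-memory-size");
System.out.println("off-heap-memory-size = " + offHeapSize);  // expected: "0"

cache.getResourceManager().setCriticalHeapPercentage(90.0f);  // refuse new work near the limit
cache.getResourceManager().setEvictionHeapPercentage(75.0f);  // start heap-LRU eviction earlier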
