Here’s what I do:
// only one value in memory, rest are overflowed to disk
return cache.<String, String>createRegionFactory(RegionShortcut.PARTITION_PERSISTENT_OVERFLOW)
    .setEvictionAttributes(
        EvictionAttributes.createLIFOEntryAttributes(1, EvictionAction.OVERFLOW_TO_DISK))
    .setDiskStoreName(diskStore)
    .create(name);
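
If pinning the region to a single in-memory entry turns out to be too aggressive for reads, a heap-LRU variant should also do the job. This is only a sketch (not tested here), reusing the same cache, diskStore and name variables as above; as far as I recall, PARTITION_PERSISTENT_OVERFLOW already defaults to heap-LRU overflow, so only the eviction threshold needs to be set:

// evict values to disk once tenured heap crosses the threshold,
// instead of keeping a fixed entry count in memory
cache.getResourceManager().setEvictionHeapPercentage(50.0f);

return cache.<String, String>createRegionFactory(RegionShortcut.PARTITION_PERSISTENT_OVERFLOW)
    .setDiskStoreName(diskStore)
    .create(name);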

> On Apr 26, 2016, at 12:46 PM, Eugene Strokin <[email protected]> wrote:
>
> Right, this is the region I'm still using. And the disk store looks like this:
>
> Cache cache = new CacheFactory()
>     .set("locators", LOCATORS.get())
>     .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
>     .set("bind-address", LOCATOR_IP.get())
>     .create();
>
> cache.createDiskStoreFactory()
>     .setMaxOplogSize(500)
>     .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
>     .setCompactionThreshold(95)
>     .create("-ccio-store");
>
> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
> Region<String, byte[]> region = regionFactory
>     .setDiskStoreName("-ccio-store")
>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>     .setOffHeap(false)
>     .setMulticastEnabled(false)
>     .setCacheLoader(new AwsS3CacheLoader())
>     .create("ccio-images");
>
> I thought that, since I have a disk store specified, the overflow is set. Please correct me if I'm wrong.
>
> Thank you,
> Eugene
>
> On Tue, Apr 26, 2016 at 3:40 PM, Udo Kohlmeyer <[email protected]> wrote:
> Hi there Eugene,
>
> Geode will try to keep as much data in memory as it can, depending on the LRU eviction strategy. Once data is overflowed to disk, the memory for the "value" would be freed up once GC has run.
>
> Is this still the correct region configuration you are using?
>
> Region<String, byte[]> region = regionFactory
>     .setDiskStoreName("-ccio-store")
>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>     .setOffHeap(false)
>     .setMulticastEnabled(false)
>     .setCacheLoader(new AwsS3CacheLoader())
>     .create("ccio-images");
>
> If not, could you please provide the current config you are testing with? Because this config does not enable overflow.
>
> --Udo
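
For the archives: as Udo says, DataPolicy.PERSISTENT_PARTITION on its own only persists; overflow has to be switched on through eviction attributes. Applied to the factory chain quoted above it would look roughly like this (untested sketch, and the 1000-entry limit is only a placeholder to tune):

Region<String, byte[]> region = regionFactory
    .setDiskStoreName("-ccio-store")
    .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
    .setEvictionAttributes(
        EvictionAttributes.createLRUEntryAttributes(1000, EvictionAction.OVERFLOW_TO_DISK))
    .setOffHeap(false)
    .setMulticastEnabled(false)
    .setCacheLoader(new AwsS3CacheLoader())
    .create("ccio-images");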

> On 27/04/2016 4:51 am, Eugene Strokin wrote:
>> Right, I do have 1432 objects in my cache. But I thought only the keys would be in memory, while the actual data would still be on disk, and when a client tried to get it, the data would be retrieved from storage. I'm expecting to keep millions of records in the cache, but I don't have the memory to keep all of them there, so I've set up overflow to disk, assuming that memory will be freed up as more and more data comes in. Is my assumption wrong? Or do I need to have RAM for all the data?
>>
>> Thanks,
>> Eugene
>>
>> On Tue, Apr 26, 2016 at 2:04 PM, Barry Oglesby <[email protected]> wrote:
>> The VersionedThinDiskRegionEntryHeapObjectKey instances are your region entries (your data). When you restart your server, it recovers that data from disk and stores it in those region entries. Are you not meaning to persist your data?
>>
>> If I run a quick test with 1432 objects of ~120k data size and non-primitive keys, a histogram shows output like the one below. I deleted most of the lines that are not relevant. You can see there are 1432 VersionedThinDiskRegionEntryHeapObjectKeys, TradeKeys (my key) and VMCachedDeserializables (these are wrappers around the value). You should see something similar. The byte arrays and character arrays are most of my data.
>>
>> If you configure your regions to not be persistent, you won't see any of this upon recovery.
>>
>>  num     #instances         #bytes  class name
>> ----------------------------------------------
>>    1:          3229      172532264  [B
>>    2:         37058        3199464  [C
>>   27:          1432          80192  com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
>>   41:          1432          34368  TradeKey
>>   42:          1432          34368  com.gemstone.gemfire.internal.cache.VMCachedDeserializable
>> Total        256685      184447072
>>
>> Thanks,
>> Barry Oglesby
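
(Worth noting from Barry's histogram: 80192 + 34368 + 34368 bytes across 1432 entries works out to roughly 100 bytes of heap per entry before counting the key contents, so millions of entries will still cost on the order of 100 MB of heap for keys and entry metadata even with every value overflowed to disk.)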

>> On Tue, Apr 26, 2016 at 10:09 AM, Eugene Strokin <[email protected]> wrote:
>> Digging more into the problem, I've found that 91% of the heap is taken by:
>>
>> 1,432 instances of "com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey", loaded by "sun.misc.Launcher$AppClassLoader @ 0xef589a90", occupy 121,257,480 (91.26%) bytes. These instances are referenced from one instance of "com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]", loaded by "sun.misc.Launcher$AppClassLoader @ 0xef589a90"
>>
>> Keywords
>> sun.misc.Launcher$AppClassLoader @ 0xef589a90
>> com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
>> com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]
>>
>> 1,432 instances doesn't sound like a lot, but it looks like those are big instances, about 121k each. Maybe something is wrong with my configuration and I can limit creating such instances?
>>
>> Thanks,
>> Eugene
>>
>> On Mon, Apr 25, 2016 at 4:19 PM, Jens Deppe <[email protected]> wrote:
>> I think you're looking at the wrong info in ps.
>>
>> What you're showing is the Virtual size (vsz) of memory. This is how much the process has requested, but that does not mean it is actually using it. In fact, your output says that Java has reserved 3Gb of memory, not 300Mb! You should instead look at the Resident Set Size (rss option), as that will give you a much more accurate picture of what is actually using real memory.
>>
>> Also, remember that the JVM also needs memory for loaded code (jars and classes), JITed code, thread stacks, etc., so when setting your heap size you should take that into account too.
>>
>> Finally, especially on virtualized hardware, and doubly so on small configs, make sure you never, ever end up swapping, because that will really kill your performance.
>>
>> --Jens
>>
>> On Mon, Apr 25, 2016 at 12:32 PM, Anilkumar Gingade <[email protected]> wrote:
>>
>> It joined the cluster, and loaded data from overflow files.
>> Not sure if this makes the OS file system (disk buffer/cache) consume memory...
>> When you say overflow, I am assuming you are initializing the data/regions using persistence files; if so, can you try without the persistence...
>>
>> -Anil.
>>
>> On Mon, Apr 25, 2016 at 12:18 PM, Eugene Strokin <[email protected]> wrote:
>> And when I'm checking memory usage per process, it looks normal: java took only 300Mb, as it was supposed to, but free -m still shows no memory:
>>
>> # ps axo pid,vsz,comm=|sort -n -k 2
>>   PID     VSZ
>>   465   26396 systemd-logind
>>   444   26724 dbus-daemon
>>   454   27984 avahi-daemon
>>   443   28108 avahi-daemon
>>   344   32720 systemd-journal
>>     1   41212 systemd
>>   364   43132 systemd-udevd
>> 27138   52688 sftp-server
>>   511   53056 wpa_supplicant
>>   769   82548 sshd
>> 30734   83972 sshd
>>  1068   91128 master
>> 28534   91232 pickup
>>  1073   91300 qmgr
>>   519  110032 agetty
>> 27029  115380 bash
>> 27145  115380 bash
>> 30736  116440 sort
>>   385  116720 auditd
>>   489  126332 crond
>> 30733  139624 sshd
>> 27027  140840 sshd
>> 27136  140840 sshd
>> 27143  140840 sshd
>> 30735  148904 ps
>>   438  242360 rsyslogd
>>   466  447932 NetworkManager
>>   510  527448 polkitd
>>   770  553060 tuned
>> 30074 2922460 java
>>
>> # free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:            489         424           5           0          58          41
>> Swap:           255          57         198
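
(Side note: that listing is still sorted on VSZ, so the 2.9 GB java figure is reserved address space rather than real usage; something like ps axo pid,rss,comm=|sort -n -k 2 shows the resident numbers. The dmesg entry quoted at the bottom of the thread, anon-rss:335780kB, puts the java process at roughly 330 MB resident when it was killed.)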

>> On Mon, Apr 25, 2016 at 2:52 PM, Eugene Strokin <[email protected]> wrote:
>> Thanks for your help, but I'm still struggling with the system OOM-killer issue. I was doing more digging and still couldn't find the problem.
>> All settings are normal: overcommit_memory=0, overcommit_ratio=50.
>> free -m before the process starts:
>>
>> # free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:            489          25         399           1          63         440
>> Swap:           255          57         198
>>
>> I start my process like this:
>>
>> java -server -Xmx300m -Xms300m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=55 -jar /opt/ccio-image.jar
>>
>> So, I should still have about 99Mb of free memory, but:
>>
>> # free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:            489         409           6           1          73          55
>> Swap:           255          54         201
>>
>> And I haven't even made a single call to the process yet. It joined the cluster, and loaded data from overflow files. And all my free memory is gone, even though I've set 300Mb max for Java.
>> As I mentioned before, I've set the off-heap memory setting to false:
>>
>> Cache cache = new CacheFactory()
>>     .set("locators", LOCATORS.get())
>>     .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
>>     .set("bind-address", LOCATOR_IP.get())
>>     .create();
>>
>> cache.createDiskStoreFactory()
>>     .setMaxOplogSize(500)
>>     .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
>>     .setCompactionThreshold(95)
>>     .create("-ccio-store");
>>
>> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>
>> Region<String, byte[]> region = regionFactory
>>     .setDiskStoreName("-ccio-store")
>>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>     .setOffHeap(false)
>>     .setMulticastEnabled(false)
>>     .setCacheLoader(new AwsS3CacheLoader())
>>     .create("ccio-images");
>>
>> I don't understand how the memory is getting overcommitted.
>>
>> Eugene
>>
>> On Fri, Apr 22, 2016 at 8:03 PM, Barry Oglesby <[email protected]> wrote:
>> The OOM killer uses the overcommit_memory and overcommit_ratio parameters to determine if / when to kill a process.
>>
>> What are the settings for these parameters in your environment? The defaults are 0 and 50.
>>
>> cat /proc/sys/vm/overcommit_memory
>> 0
>>
>> cat /proc/sys/vm/overcommit_ratio
>> 50
>>
>> How much free memory is available before you start the JVM?
>> How much free memory is available when your process is killed?
>>
>> You can monitor free memory using either free or vmstat before and during your test.
>>
>> Run free -m in a loop to monitor free memory, like:
>>
>> free -ms2
>>              total       used       free     shared    buffers     cached
>> Mem:        290639      35021     255617          0       9215      21396
>> -/+ buffers/cache:       4408     286230
>> Swap:        20473          0      20473
>>
>> Run vmstat in a loop to monitor memory, like:
>>
>> vmstat -SM 2
>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>>  0  0      0 255619   9215  21396    0    0     0    23    0    0  2  0 98  0  0
>>  0  0      0 255619   9215  21396    0    0     0     0  121  198  0  0 100 0  0
>>  0  0      0 255619   9215  21396    0    0     0     0  102  189  0  0 100 0  0
>>  0  0      0 255619   9215  21396    0    0     0     0  110  195  0  0 100 0  0
>>  0  0      0 255619   9215  21396    0    0     0     0  117  205  0  0 100 0  0
>>
>> Thanks,
>> Barry Oglesby
>>
>> On Fri, Apr 22, 2016 at 4:44 PM, Dan Smith <[email protected]> wrote:
>> The java metaspace will also take up memory. Maybe try setting -XX:MaxMetaspaceSize
>>
>> -Dan
>>
>> -------- Original message --------
>> From: Eugene Strokin <[email protected]>
>> Date: 4/22/2016 4:34 PM (GMT-08:00)
>> To: [email protected]
>> Subject: Re: System Out of Memory
>>
>> The machine is small, it has only 512mb RAM plus 256mb swap, but Java's max heap size is set to 400mb. I've tried less, no help. And the most interesting part is that I don't see Java OOM exceptions at all. I even included code with a memory leak, and I did see the Java OOM exceptions before the java process got killed then.
>> I've browsed the internet, and some people have actually noticed the same problem with other frameworks, not Geode. So I'm suspecting this could be not Geode, but Geode was the first suspect because it has the off-heap storage feature. They say that there was a memory leak, but for some reason the OS was killing the process even before Java was getting OOM.
>> I'll connect with JProbe and will be monitoring the system with the console. Will let you know if I find something interesting.
>>
>> Thanks,
>> Eugene
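
(Side note on sizing: on a 512 MB droplet it is probably worth capping the non-heap pools explicitly as well. As a rough, untested starting point, something like

java -server -Xmx200m -Xms200m -XX:MaxMetaspaceSize=64m -Xss256k -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=55 -jar /opt/ccio-image.jar

leaves more headroom for metaspace, thread stacks and the OS itself than a 300-400 MB heap does.)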

>> On Fri, Apr 22, 2016 at 5:55 PM, Dan Smith <[email protected]> wrote:
>> What's your -Xmx for your JVM set to, and how much memory does your droplet have? Does it have any swap space? My guess is you need to reduce the heap size of your JVM and the OS is killing your process because there is not enough memory left.
>>
>> -Dan
>>
>> On Fri, Apr 22, 2016 at 1:55 PM, Darrel Schneider <[email protected]> wrote:
>> > I don't know why your OS would be killing your process, which seems like your main problem.
>> >
>> > But I did want you to know that if you don't have any regions with off-heap=true then you have no reason to set off-heap-memory-size to anything other than 0.
>> >
>> > On Fri, Apr 22, 2016 at 12:48 PM, Eugene Strokin <[email protected]> wrote:
>> >>
>> >> I'm running load tests on the Geode cluster I've built. The OS is killing my process occasionally, complaining that the process takes too much memory:
>> >>
>> >> # dmesg
>> >> [ 2544.932226] Out of memory: Kill process 5382 (java) score 780 or sacrifice child
>> >> [ 2544.933591] Killed process 5382 (java) total-vm:3102804kB, anon-rss:335780kB, file-rss:0kB
>> >>
>> >> Java doesn't have any problems; I don't see an OOM exception. It looks like Geode is using off-heap memory, but I set offHeap to false for my region, and I do have only one region:
>> >>
>> >> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>> >> regionFactory
>> >>     .setDiskStoreName("-ccio-store")
>> >>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>> >>     .setOffHeap(false)
>> >>     .setCacheLoader(new AwsS3CacheLoader());
>> >>
>> >> Also, I've played with the off-heap-memory-size setting, setting it to a small number like 20M to prevent Geode from taking too much off-heap memory, but the result is the same.
>> >>
>> >> Do you have any other ideas what I could do here? I'm stuck at this point.
>> >>
>> >> Thank you,
>> >> Eugene
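
Rough arithmetic from the numbers in this thread: the dmesg line shows java at about 330 MB resident (anon-rss:335780kB) on a 489 MB box, free -m shows another 60-70 MB in buff/cache plus the other daemons, and swap is only 256 MB, so this looks like plain memory pressure on the droplet rather than a Geode off-heap leak. A smaller heap with capped non-heap pools, or simply a bigger droplet, should keep the OOM killer away.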