Hi there Eugene,

Geode will try to keep as much data in memory as it can, subject to the configured LRU eviction strategy. Once a value is overflowed to disk, the memory it occupied is freed the next time GC runs.

Is this still the correct region configuration you are using?

 Region<String, byte[]> region = regionFactory
     .setDiskStoreName("-ccio-store")
     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
     .setOffHeap(false)
     .setMulticastEnabled(false)
     .setCacheLoader(new AwsS3CacheLoader())
     .create("ccio-images");

If not, could you please provide the config you are currently testing with? Note that the config above does not enable overflow: with PERSISTENT_PARTITION and no eviction attributes, every entry is written to disk, but a copy of every value is also kept in memory.
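
To enable overflow, you would add eviction attributes to the region. Something like this (an untested sketch; the 100MB limit is only an illustration) would overflow the least recently used values to disk once the region's in-memory size passes the limit:

 Region<String, byte[]> region = regionFactory
     .setDiskStoreName("-ccio-store")
     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
     // EvictionAttributes and EvictionAction are in
     // com.gemstone.gemfire.cache; a null ObjectSizer uses the default
     .setEvictionAttributes(EvictionAttributes.createLRUMemoryAttributes(
         100, null, EvictionAction.OVERFLOW_TO_DISK))
     .setOffHeap(false)
     .setMulticastEnabled(false)
     .setCacheLoader(new AwsS3CacheLoader())
     .create("ccio-images");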

http://geode.docs.pivotal.io/docs/reference/topics/memory_requirements_guidelines_and_calc.html#topic_ac4_mtz_j4

--Udo

On 27/04/2016 4:51 am, Eugene Strokin wrote:
Right, I do have 1,432 objects in my cache. But I thought only the keys would be kept in memory; the actual data would stay on disk, and when a client tried to get an entry, the data would be retrieved from storage. I'm expecting to keep millions of records in the cache, but I don't have enough memory to hold all of them, so I set up overflow to disk, assuming memory would be freed as more and more data came in.
Is my assumption wrong? Or do I need enough RAM for all the data?

Thanks,
Eugene


On Tue, Apr 26, 2016 at 2:04 PM, Barry Oglesby <[email protected]> wrote:

    The VersionedThinDiskRegionEntryHeapObjectKey instances are your
    region entries (your data). When you restart your server, it
    recovers that data from disk and stores it in those region entries.
    Do you not mean to persist your data?

    If I run a quick test with 1432 objects of ~120k data size and
    non-primitive keys, a histogram shows output like the one below. I
    deleted most of the lines that are not relevant. You can see there
    are 1432 VersionedThinDiskRegionEntryHeapObjectKeys, TradeKeys (my
    key), and VMCachedDeserializables (wrappers on the value). You
    should see something similar. The byte arrays and character arrays
    are most of my data.

    If you configure your regions to not be persistent, you won't see
    any of this upon recovery.

     num     #instances         #bytes  class name
    ----------------------------------------------
       1:          3229      172532264  [B
       2:         37058        3199464  [C
      27:          1432          80192  com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
      41:          1432          34368  TradeKey
      42:          1432          34368  com.gemstone.gemfire.internal.cache.VMCachedDeserializable
    Total        256685      184447072
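
    (For reference, a histogram like this can be captured from a
    running HotSpot JVM with "jmap -histo <pid>".)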


    Thanks,
    Barry Oglesby


    On Tue, Apr 26, 2016 at 10:09 AM, Eugene Strokin <[email protected]> wrote:

        Digging more into the problem, I've found that 91% of the heap
        is taken by:

        1,432 instances of
        "com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey",
        loaded by "sun.misc.Launcher$AppClassLoader @ 0xef589a90",
        occupy 121,257,480 (91.26%) bytes. These instances are
        referenced from one instance of
        "com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]",
        loaded by "sun.misc.Launcher$AppClassLoader @ 0xef589a90"

        Keywords:

        sun.misc.Launcher$AppClassLoader @ 0xef589a90
        com.gemstone.gemfire.internal.cache.VersionedThinDiskRegionEntryHeapObjectKey
        com.gemstone.gemfire.internal.cache.ProxyBucketRegion[]

        1,432 instances doesn't sound like a lot, but those look like
        big instances: about 85k each on average (121,257,480 / 1,432).
        Maybe something is wrong with my configuration, and I can limit
        the creation of such instances?

        Thanks,
        Eugene

        On Mon, Apr 25, 2016 at 4:19 PM, Jens Deppe <[email protected]> wrote:

            I think you're looking at the wrong info in ps.

            What you're showing is the virtual size (vsz) of the
            process. This is how much address space the process has
            requested, but that does not mean it is actually using it.
            In fact, your output says that Java has reserved 3GB of
            memory, not 300MB! You should instead look at the resident
            set size (the rss option), as that will give you a much
            more accurate picture of what is actually using real
            memory.

            Also, remember that the JVM needs memory beyond the heap
            for loaded code (jars and classes), JIT-compiled code,
            thread stacks, etc., so take that into account when setting
            your heap size.
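
            For example (the values here are purely illustrative), some
            of those areas can be capped explicitly alongside the heap:

            java -server -Xmx300m -Xms300m -Xss256k
                -XX:ReservedCodeCacheSize=32m -jar /opt/ccio-image.jar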

            Finally, especially on virtualized hardware and doubly so
            on small configs, make sure you *never, ever* end up
            swapping because that will really kill your performance.

            --Jens

            On Mon, Apr 25, 2016 at 12:32 PM, Anilkumar Gingade <[email protected]> wrote:

                >> It joined the cluster, and loaded data from overflow
                files.
                Not sure if this makes the OS file system (disk
                buffer/cache) consume memory...
                When you say overflow, I am assuming you are
                initializing the data/regions from persistence files;
                if so, can you try without persistence?
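
                For example (an untested sketch), a non-persistent
                partitioned region, so that nothing is recovered from
                disk at startup:

                Region<String, byte[]> region = regionFactory
                    .setDataPolicy(DataPolicy.PARTITION) // instead of PERSISTENT_PARTITION
                    .setOffHeap(false)
                    .create("ccio-images");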

                -Anil.

                On Mon, Apr 25, 2016 at 12:18 PM, Eugene Strokin <[email protected]> wrote:

                    And when I check memory usage per process, it
                    looks normal: java took only 300MB, as it is
                    supposed to, but free -m still shows no free
                    memory:

                    # ps axo pid,vsz,comm=|sort -n -k 2
                      PID    VSZ
                      465  26396 systemd-logind
                      444  26724 dbus-daemon
                      454  27984 avahi-daemon
                      443  28108 avahi-daemon
                      344  32720 systemd-journal
                        1  41212 systemd
                      364  43132 systemd-udevd
                    27138  52688 sftp-server
                      511  53056 wpa_supplicant
                      769  82548 sshd
                    30734  83972 sshd
                     1068  91128 master
                    28534  91232 pickup
                     1073  91300 qmgr
                      519 110032 agetty
                    27029 115380 bash
                    27145 115380 bash
                    30736 116440 sort
                      385 116720 auditd
                      489 126332 crond
                    30733 139624 sshd
                    27027 140840 sshd
                    27136 140840 sshd
                    27143 140840 sshd
                    30735 148904 ps
                      438 242360 rsyslogd
                      466 447932 NetworkManager
                      510 527448 polkitd
                      770 553060 tuned
                    30074 2922460 java

                    # free -m
                                  total   used   free  shared  buff/cache  available
                    Mem:            489    424      5       0          58         41
                    Swap:           255     57    198


                    On Mon, Apr 25, 2016 at 2:52 PM, Eugene Strokin <[email protected]> wrote:

                        Thanks for your help, but I'm still struggling
                        with the system OOM-killer issue.
                        I've been doing more digging and still couldn't
                        find the problem.
                        All settings are normal: overcommit_memory=0,
                        overcommit_ratio=50.
                        free -m before the process starts:

                        # free -m
                                      total   used   free  shared  buff/cache  available
                        Mem:            489     25    399       1          63        440
                        Swap:           255     57    198

                        I start my process like this:

                        java -server -Xmx300m -Xms300m
                            -XX:+HeapDumpOnOutOfMemoryError
                            -XX:+UseConcMarkSweepGC
                            -XX:CMSInitiatingOccupancyFraction=55
                            -jar /opt/ccio-image.jar


                        So, I should still have about 99MB of free
                        memory, but:

                        # free -m
                                      total   used   free  shared  buff/cache  available
                        Mem:            489    409      6       1          73         55
                        Swap:           255     54    201

                        And I haven't even made a single call to the
                        process yet. It joined the cluster and loaded
                        data from the overflow files, and all my free
                        memory is gone, even though I set a 300MB max
                        heap for Java.
                        As I mentioned before, I've set the off-heap
                        setting to false:

                        Cache cache = new CacheFactory()
                            .set("locators", LOCATORS.get())
                            .set("start-locator",
                                LOCATOR_IP.get() + "[" + LOCATOR_PORT.get() + "]")
                            .set("bind-address", LOCATOR_IP.get())
                            .create();

                        cache.createDiskStoreFactory()
                            .setMaxOplogSize(500) // max oplog file size, in MB
                            .setDiskDirsAndSizes(
                                new File[] { new File("/opt/ccio/geode/store") },
                                new int[] { 18000 }) // dir size limit, in MB
                            .setCompactionThreshold(95)
                            .create("-ccio-store");

                        RegionFactory<String, byte[]> regionFactory =
                            cache.createRegionFactory();

                        Region<String, byte[]> region = regionFactory
                            .setDiskStoreName("-ccio-store")
                            .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
                            .setOffHeap(false)
                            .setMulticastEnabled(false)
                            .setCacheLoader(new AwsS3CacheLoader())
                            .create("ccio-images");

                        I don't understand how the memory is getting
                        overcommitted.

                        Eugene

                        On Fri, Apr 22, 2016 at 8:03 PM, Barry Oglesby <[email protected]> wrote:

                            The OOM killer uses the overcommit_memory
                            and overcommit_ratio parameters to
                            determine if / when to kill a process.

                            What are the settings for these parameters
                            in your environment?

                            The defaults are 0 and 50.

                            cat /proc/sys/vm/overcommit_memory
                            0

                            cat /proc/sys/vm/overcommit_ratio
                            50

                            How much free memory is available before
                            you start the JVM?

                            How much free memory is available when
                            your process is killed?

                            You can monitor free memory using either
                            free or vmstat before and during your test.

                            Run free -m in a loop to monitor free
                            memory, like:

                            free -ms2
                                         total    used    free  shared  buffers  cached
                            Mem:        290639   35021  255617       0     9215   21396
                            -/+ buffers/cache:    4408  286230
                            Swap:        20473       0   20473

                            Run vmstat in a loop to monitor memory, like:

                            vmstat -SM 2
                            procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
                             r  b swpd    free  buff  cache   si   so    bi    bo   in   cs us sy id wa st
                             0  0    0  255619  9215  21396    0    0     0    23    0    0  2  0 98  0  0
                             0  0    0  255619  9215  21396    0    0     0     0  121  198  0  0 100 0  0
                             0  0    0  255619  9215  21396    0    0     0     0  102  189  0  0 100 0  0
                             0  0    0  255619  9215  21396    0    0     0     0  110  195  0  0 100 0  0
                             0  0    0  255619  9215  21396    0    0     0     0  117  205  0  0 100 0  0


                            Thanks,
                            Barry Oglesby


                            On Fri, Apr 22, 2016 at 4:44 PM, Dan Smith <[email protected]> wrote:

                                The java metaspace will also take up
                                memory. Maybe try setting
                                -XX:MaxMetaspaceSize
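
                                For example (the 64m value is purely
                                illustrative):

                                java -Xmx300m -XX:MaxMetaspaceSize=64m
                                    -jar /opt/ccio-image.jar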

                                -Dan


                                -------- Original message --------
                                From: Eugene Strokin <[email protected]>
                                Date: 4/22/2016 4:34 PM (GMT-08:00)
                                To: [email protected]
                                Subject: Re: System Out of Memory

                                The machine is small, it has only 512MB
                                RAM, plus 256MB swap.
                                But Java's max heap size is set to
                                400MB. I've tried less; no help. And
                                the most interesting part is that I
                                don't see Java OOM exceptions at all.
                                I even included code with a memory
                                leak, and in that case I saw Java OOM
                                exceptions before the java process got
                                killed.
                                I've browsed the internet, and some
                                people have noticed the same problem
                                with other frameworks, not Geode. So I
                                suspect this might not be Geode, but
                                Geode was the first suspect because it
                                has an off-heap storage feature. They
                                say there was a memory leak, but for
                                some reason the OS was killing the
                                process even before Java hit OOM.
                                I'll connect with JProbe and will
                                monitor the system with the console.
                                I'll let you know if I find something
                                interesting.

                                Thanks,
                                Eugene


                                On Fri, Apr 22, 2016 at 5:55 PM, Dan Smith <[email protected]> wrote:

                                    What's your -Xmx for your JVM set
                                    to, and how much memory does your
                                    droplet have? Does it have any
                                    swap space? My guess is you need to
                                    reduce the heap size of your JVM
                                    and the OS is killing your process
                                    because there is not enough memory
                                    left.

                                    -Dan

                                    On Fri, Apr 22, 2016 at 1:55 PM,
                                    Darrel Schneider <[email protected]> wrote:
                                    > I don't know why your OS would be
                                    > killing your process, which seems
                                    > like your main problem.
                                    >
                                    > But I did want you to know that if
                                    > you don't have any regions with
                                    > off-heap=true, then you have no
                                    > reason to set off-heap-memory-size
                                    > to anything other than 0.
                                    >
                                    > On Fri, Apr 22, 2016 at 12:48 PM,
                                    > Eugene Strokin <[email protected]> wrote:
                                    >>
                                    >> I'm running load tests on the
                                    >> Geode cluster I've built.
                                    >> The OS is killing my process
                                    >> occasionally, complaining that the
                                    >> process takes too much memory:
                                    >>
                                    >> # dmesg
                                    >> [ 2544.932226] Out of memory: Kill
                                    >> process 5382 (java) score 780 or
                                    >> sacrifice child
                                    >> [ 2544.933591] Killed process 5382
                                    >> (java) total-vm:3102804kB,
                                    >> anon-rss:335780kB, file-rss:0kB
                                    >>
                                    >> Java doesn't have any problems; I
                                    >> don't see an OOM exception.
                                    >> It looks like Geode is using
                                    >> off-heap memory. But I set offHeap
                                    >> to false for my region, and I have
                                    >> only one region:
                                    >>
                                    >> RegionFactory<String, byte[]> regionFactory =
                                    >>     cache.createRegionFactory();
                                    >> regionFactory
                                    >>     .setDiskStoreName("-ccio-store")
                                    >>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
                                    >>     .setOffHeap(false)
                                    >>     .setCacheLoader(new AwsS3CacheLoader());
                                    >>
                                    >> Also, I've played with the
                                    >> off-heap-memory-size setting,
                                    >> setting it to a small number like
                                    >> 20M to prevent Geode from taking
                                    >> too much off-heap memory, but the
                                    >> result is the same.
                                    >>
                                    >> Do you have any other ideas what I
                                    >> could do here? I'm stuck at this
                                    >> point.
                                    >>
                                    >> Thank you,
                                    >> Eugene
