Example region settings for the region shown in the stack trace:

<gfe:replicated-region id="net.lautus.gls.domain.life.accounting.AccountingTransaction"
                       disk-store-ref="tauDiskStore" statistics="true"
                       persistent="true" scope="distributed-no-ack"
                       enable-async-conflation="true"
                       enable-subscription-conflation="true">
    <!--<gfe:cache-listener ref="cacheListener"/>-->
    <gfe:eviction type="HEAP_PERCENTAGE" action="OVERFLOW_TO_DISK"/>
</gfe:replicated-region>



YourKit reports a frozen thread, but I can see that the function is still
executing slowly in the background and is busy with the size calculations.
Thread-4 Frozen for at least 10h 49m 58s


Thread-4 Frozen for at least 10h 49m 58s
  sun.nio.ch.FileDispatcherImpl.read0(FileDescriptor, long, int) FileDispatcherImpl.java (native)
  sun.nio.ch.SocketDispatcher.read(FileDescriptor, long, int) SocketDispatcher.java:39
  sun.nio.ch.IOUtil.readIntoNativeBuffer(FileDescriptor, ByteBuffer, long, NativeDispatcher) IOUtil.java:223
  sun.nio.ch.IOUtil.read(FileDescriptor, ByteBuffer, long, NativeDispatcher) IOUtil.java:192
  sun.nio.ch.SocketChannelImpl.read(ByteBuffer) SocketChannelImpl.java:380
  org.apache.geode.internal.tcp.Connection.runNioReader() Connection.java:1808
  org.apache.geode.internal.tcp.Connection.run() Connection.java:1688
  java.lang.Thread.run() Thread.java:748

Generated by YourKit Java Profiler 2018.04-b81 August 16, 2018 10:52:51 AM

On Thu, Aug 16, 2018 at 10:37 AM, Pieter van Zyl <[email protected]>
wrote:

> Good morning.
>
> We are busy with a prototype to evaluate the use of Geode in our company.
> Now we are trying to go through all our regions to perform some form of
> validations. We are using a function to perform the validation.
>
> While iterating through the regions it seems to slow down dramatically.
>
> The total database has about 98 million objects. We fly through about 24
> million in 1.5 minutes.
>
> Then we hit certain objects in a Region that are large and everything slows
> down. We then process about 10 000 entries every 1.5 hours.
> We needed to set the server and locator timeouts so that we don't get
> kicked off.
>
> The objects can be quite large.
>
> Using YourKit I can see the following:
>
> ValidationThread0  Runnable CPU usage on sample: 1s
>   it.unimi.dsi.fastutil.objects.ReferenceOpenHashSet.rehash(int)
> ReferenceOpenHashSet.java:578
>   it.unimi.dsi.fastutil.objects.ReferenceOpenHashSet.add(Object)
> ReferenceOpenHashSet.java:279
>   org.apache.geode.internal.size.ObjectTraverser$VisitStack.add(Object,
> Object) ObjectTraverser.java:159
>   org.apache.geode.internal.size.ObjectTraverser.doSearch(Object,
> ObjectTraverser$VisitStack) ObjectTraverser.java:83
>   org.apache.geode.internal.size.ObjectTraverser.breadthFirstSearch(Object,
> ObjectTraverser$Visitor, boolean) ObjectTraverser.java:50
>
> *org.apache.geode.internal.size.ObjectGraphSizer.size(Object,
> ObjectGraphSizer$ObjectFilter, boolean) ObjectGraphSizer.java:98
> org.apache.geode.internal.size.ReflectionObjectSizer.sizeof(Object)
> ReflectionObjectSizer.java:66*
>   org.apache.geode.internal.size.SizeClassOnceObjectSizer.sizeof(Object)
> SizeClassOnceObjectSizer.java:60
>   org.apache.geode.internal.cache.eviction.SizeLRUController.sizeof(Object)
> SizeLRUController.java:68
>   org.apache.geode.internal.cache.eviction.HeapLRUController.entrySize(Object,
> Object) HeapLRUController.java:92
>   org.apache.geode.internal.cache.entries.VersionedStatsDiskLRURegionEntryHeapLongKey.updateEntrySize(EvictionController, Object)
> VersionedStatsDiskLRURegionEntryHeapLongKey.java:207
>   
> org.apache.geode.internal.cache.VMLRURegionMap.beginChangeValueForm(EvictableEntry,
> CachedDeserializable, Object) VMLRURegionMap.java:178
>   
> org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(Region,
> RegionEntry) VMCachedDeserializable.java:119
>   org.apache.geode.internal.cache.LocalRegion.getDeserialized(RegionEntry,
> boolean, boolean, boolean, boolean) LocalRegion.java:1293
>   
> org.apache.geode.internal.cache.LocalRegion.getDeserializedValue(RegionEntry,
> KeyInfo, boolean, boolean, boolean, EntryEventImpl, boolean, boolean)
> LocalRegion.java:1232
>   
> org.apache.geode.internal.cache.LocalRegionDataView.getDeserializedValue(KeyInfo,
> LocalRegion, boolean, boolean, boolean, EntryEventImpl, boolean, boolean)
> LocalRegionDataView.java:43
>   org.apache.geode.internal.cache.LocalRegion.get(Object, Object,
> boolean, boolean, boolean, ClientProxyMembershipID, EntryEventImpl,
> boolean, boolean, boolean) LocalRegion.java:1384
>   org.apache.geode.internal.cache.LocalRegion.get(Object, Object,
> boolean, boolean, boolean, ClientProxyMembershipID, EntryEventImpl,
> boolean) LocalRegion.java:1334
>   org.apache.geode.internal.cache.LocalRegion.get(Object, Object,
> boolean, EntryEventImpl) LocalRegion.java:1319
>   org.apache.geode.internal.cache.AbstractRegion.get(Object)
> AbstractRegion.java:408
>   org.rdb.geode.session.GeodeDatabaseSessionObject.lazyLoadField(String)
> GeodeDatabaseSessionObject.java:240
>   
> net.lautus.gls.domain.life.accounting.AccountingTransaction.lazyLoadField(String)
> AccountingTransaction.java:1
>   org.rdb.internal.aspect.PersistenceAspect.getField(JoinPoint, Object)
> PersistenceAspect.java:68
>   
> net.lautus.gls.domain.life.accounting.AccountingTransaction.thoroughValidate()
> AccountingTransaction.java:33
>   
> net.lautus.gls.tools.validation.ValidateDomainObjectScript.run(DatabaseSession,
> PersistentDomainObject) ValidateDomainObjectScript.java:36
>   
> net.lautus.gls.tools.validation.ValidateDomainObjectScript.run(DatabaseSession,
> Object) ValidateDomainObjectScript.java:13
>   
> org.rdb.util.validator.internal.geode.GeodeValidationRunnable.validateInstance(Object,
> InstanceScript, DatabaseSession) GeodeValidationRunnable.java:100
>   
> org.rdb.util.validator.internal.geode.GeodeValidationRunnable.operation(TransactionStrategy,
> OrderedObject) GeodeValidationRunnable.java:84
>   
> org.rdb.util.validator.internal.geode.GeodeValidationRunnable.operation(TransactionStrategy,
> Object) GeodeValidationRunnable.java:22
>   org.rdb.util.finder.WorkerRunnable.execute() WorkerRunnable.java:39
>   org.rdb.util.finder.ThreadRunnable.run() ThreadRunnable.java:45
>   java.lang.Thread.run() Thread.java:748
>
> My worry is this logic:
> *ObjectGraphSizer.size*
>
>>
>> Find the size of an object and all objects reachable from it using
>> breadth first search. This
>> method will include objects reachable from static fields
>
>
> We have tried to use the size logic and we found that we have a lot of
> connected graphs/objects, and a root object that reported 19 GB.
>
> Our objects have a lot of fields.
>
> While our objects do use IDs to refer to other objects for one-to-one and
> one-to-many relationships, we actually resolve these IDs and build up a
> tree in memory:
> Account {
> Bank bank
>
> }
>
> Transform for storage on disk as:
> Account {
> long bankId
>
> }
>
>
> Read from disk:
> Account {
> long bankId
>
> }
> Then on first access transform to:
> Account {
> Bank bank
>
> }
>
> This means that we could build up the whole connected tree in memory.
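
The ID-to-reference transformation described above can be sketched roughly as follows. This is only an illustration of the pattern, not the actual rdb/AspectJ persistence code: the class `LazyReferenceSketch`, the `BANK_STORE` map (a hypothetical stand-in for a region lookup) and the `lookups` counter are assumptions introduced for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class LazyReferenceSketch {
    /** Hypothetical stand-in for a region lookup by ID. */
    static final Map<Long, Bank> BANK_STORE = new HashMap<>();
    /** Counts how often the store is consulted. */
    static int lookups = 0;

    static class Bank {
        final long id;
        final String name;

        Bank(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static class Account {
        private final long bankId; // stored form: only the ID
        private Bank bank;         // in-memory form: resolved on first access

        Account(long bankId) {
            this.bankId = bankId;
        }

        Bank getBank() {
            if (bank == null) {
                lookups++;
                bank = BANK_STORE.get(bankId);
            }
            return bank;
        }

        boolean isResolved() {
            return bank != null;
        }
    }

    public static void main(String[] args) {
        BANK_STORE.put(7L, new Bank(7L, "Example Bank"));
        Account account = new Account(7L);
        System.out.println(account.isResolved());   // false: only the ID is held
        System.out.println(account.getBank().name); // first access resolves the ID
        System.out.println(account.isResolved());   // true: the Bank is now referenced
    }
}
```

Once `getBank()` has been called, the `Account` holds a strong reference to the `Bank`, so anything that walks the object graph from the `Account` will also reach the `Bank` and whatever it references in turn.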
>
> I know Geode is not a Graph database or Object database and so we might
> not be using it for the correct use case.....maybe that is our fundamental
> problem.
>
> But even so....isn't this size check that is being performed during LRU
> eviction shown in the stack trace a big calculation?
> Is there a possibility to turn it off?
> Is it trying to see all connected objects so that all of them can be
> evicted?
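
To illustrate why a breadth-first size calculation over "all objects reachable" gets more expensive as references are resolved, here is a minimal sketch. It is not Geode's `ObjectGraphSizer`: the class `ReachableGraphSketch` and its `countReachable` method are hypothetical, it only counts objects instead of summing their sizes, and unlike the Javadoc quoted above it skips static fields.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

public class ReachableGraphSketch {
    static class Bank { Account[] accounts; }
    static class Account { long bankId; Bank bank; }

    /** Counts the objects reachable from root via instance fields, breadth first. */
    static int countReachable(Object root) {
        // Identity-based visited set, so each object is counted once.
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Deque<Object> queue = new ArrayDeque<>();
        visited.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            Object current = queue.remove();
            Class<?> type = current.getClass();
            if (type.isArray()) {
                if (!type.getComponentType().isPrimitive()) {
                    for (Object element : (Object[]) current) {
                        if (element != null && visited.add(element)) queue.add(element);
                    }
                }
                continue;
            }
            for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    try {
                        f.setAccessible(true);
                        Object child = f.get(current);
                        if (child != null && visited.add(child)) queue.add(child);
                    } catch (IllegalAccessException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        return visited.size();
    }

    public static void main(String[] args) {
        Bank bank = new Bank();
        Account a0 = new Account(), a1 = new Account(), a2 = new Account();
        bank.accounts = new Account[] {a0, a1, a2};
        System.out.println(countReachable(a0)); // 1: only the ID is held
        a0.bank = bank;                         // reference resolved in memory
        System.out.println(countReachable(a0)); // 5: a0, bank, the array, a1, a2
    }
}
```

In this sketch the work done per call grows with the number of reachable objects, and the visited set grows with it, which is consistent with the time spent in `ReferenceOpenHashSet.rehash` in the stack trace above.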
>
> Some information on the environment:
>
> The database size on disk is around 47 GB.
>
> The VM has 16 cores and 102 GB of memory.
>
> VM settings
>
>     -agentpath:/home/r2d2/yourkit/bin/linux-x86-64/libyjpagent.so
>     -javaagent:lib/aspectj/lib/aspectjweaver.jar
>     -Dgemfire.EXPIRY_THREADS=16
>     -Dgemfire.PREFER_SERIALIZED=false
>     -Dgemfire.enable.network.partition.detection=false
>     -Dgemfire.autopdx.ignoreConstructor=true
>     -Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true
>     -Dgemfire.member-timeout=600000
>     -Xms75g
>     -Xmx75g
>     -XX:+UseConcMarkSweepGC
>     -XX:+UseParNewGC
>     -XX:+CMSParallelRemarkEnabled
>     -XX:+UseCMSInitiatingOccupancyOnly
>     -XX:CMSInitiatingOccupancyFraction=70
>     -XX:+DisableExplicitGC
>     -XX:NewSize=21g
>     -XX:MaxNewSize=21g
>     -XX:+PrintGCDetails
>     -XX:+PrintTenuringDistribution
>     -XX:+PrintGCTimeStamps
>     -XX:+PrintGCApplicationStoppedTime
>     -verbose:gc
>     -Xloggc:/home/r2d2/rdb-geode-server/gc/gc.log
>     -Djava.rmi.server.hostname=localhost
>     -Dcom.sun.management.jmxremote.port=9010
>     -Dcom.sun.management.jmxremote.rmi.port=9010
>     -Dcom.sun.management.jmxremote.local.only=false
>     -Dcom.sun.management.jmxremote.authenticate=false
>     -Dcom.sun.management.jmxremote.ssl=false
>     -XX:+UseGCLogFileRotation
>     -XX:NumberOfGCLogFiles=10
>     -XX:GCLogFileSize=1M
>
> <!-- copy-on-read: https://gemfire.docs.pivotal.io/geode/basic_config/data_entries_custom_classes/managing_data_entries.html -->
>     <gfe:cache properties-ref="gemfire-props"
>                pdx-serializer-ref="pdxSerializer" pdx-persistent="true"
>                pdx-disk-store="pdx-disk-store"
>                eviction-heap-percentage="80" critical-heap-percentage="90"
>                id="gemfireCache" copy-on-read="false"
>                enable-auto-reconnect="true">
>
>     </gfe:cache>
>
>
> Kindly
> Pieter
>
>
