Hi Pieter! Just to double-check, do you have any GC issues? How big are your “big” objects? What serialization approach are you using (Java / DataSerializable / PDX)?
Anthony

> On Aug 16, 2018, at 7:09 AM, Michael Stolz <[email protected]> wrote:
>
> One thing to make sure of is that the function is only accessing data that is
> local to each of the nodes where it is running.
> To do this you must do something like this:
> Region<String, String> localPrimaryData =
>     PartitionRegionHelper.getLocalPrimaryData(exampleRegion);
> Then you can iterate over the entries in this local Region.
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Lead
> Mobile: +1-631-835-4771
> Download the GemFire book here.
> <https://content.pivotal.io/ebooks/scaling-data-services-with-pivotal-gemfire>
>
> On Thu, Aug 16, 2018 at 4:37 AM, Pieter van Zyl <[email protected]
> <mailto:[email protected]>> wrote:
> Good morning.
>
> We are busy with a prototype to evaluate the use of Geode in our company.
> Now we are trying to go through all our regions to perform some form of
> validation. We are using a function to perform the validation.
>
> While iterating through the regions it seems to slow down dramatically.
>
> The total database has about 98 million objects. We fly through about 24
> million in 1.5 minutes.
>
> Then we hit certain objects in a Region that are large and everything slows
> down. We then process about 10 000 entries every 1.5 hours.
> We needed to set the server and locator timeouts so that we don't get kicked
> off.
>
> The objects can be quite large.
>
> Using YourKit I can see the following:
>
> ValidationThread0 Runnable CPU usage on sample: 1s
> it.unimi.dsi.fastutil.objects.ReferenceOpenHashSet.rehash(int) ReferenceOpenHashSet.java:578
> it.unimi.dsi.fastutil.objects.ReferenceOpenHashSet.add(Object) ReferenceOpenHashSet.java:279
> org.apache.geode.internal.size.ObjectTraverser$VisitStack.add(Object, Object) ObjectTraverser.java:159
> org.apache.geode.internal.size.ObjectTraverser.doSearch(Object, ObjectTraverser$VisitStack) ObjectTraverser.java:83
> org.apache.geode.internal.size.ObjectTraverser.breadthFirstSearch(Object, ObjectTraverser$Visitor, boolean) ObjectTraverser.java:50
> org.apache.geode.internal.size.ObjectGraphSizer.size(Object, ObjectGraphSizer$ObjectFilter, boolean) ObjectGraphSizer.java:98
> org.apache.geode.internal.size.ReflectionObjectSizer.sizeof(Object) ReflectionObjectSizer.java:66
> org.apache.geode.internal.size.SizeClassOnceObjectSizer.sizeof(Object) SizeClassOnceObjectSizer.java:60
> org.apache.geode.internal.cache.eviction.SizeLRUController.sizeof(Object) SizeLRUController.java:68
> org.apache.geode.internal.cache.eviction.HeapLRUController.entrySize(Object, Object) HeapLRUController.java:92
> org.apache.geode.internal.cache.entries.VersionedStatsDiskLRURegionEntryHeapLongKey.updateEntrySize(EvictionController, Object) VersionedStatsDiskLRURegionEntryHeapLongKey.java:207
> org.apache.geode.internal.cache.VMLRURegionMap.beginChangeValueForm(EvictableEntry, CachedDeserializable, Object) VMLRURegionMap.java:178
> org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(Region, RegionEntry) VMCachedDeserializable.java:119
> org.apache.geode.internal.cache.LocalRegion.getDeserialized(RegionEntry, boolean, boolean, boolean, boolean) LocalRegion.java:1293
> org.apache.geode.internal.cache.LocalRegion.getDeserializedValue(RegionEntry, KeyInfo, boolean, boolean, boolean, EntryEventImpl, boolean, boolean) LocalRegion.java:1232
> org.apache.geode.internal.cache.LocalRegionDataView.getDeserializedValue(KeyInfo, LocalRegion, boolean, boolean, boolean, EntryEventImpl, boolean, boolean) LocalRegionDataView.java:43
> org.apache.geode.internal.cache.LocalRegion.get(Object, Object, boolean, boolean, boolean, ClientProxyMembershipID, EntryEventImpl, boolean, boolean, boolean) LocalRegion.java:1384
> org.apache.geode.internal.cache.LocalRegion.get(Object, Object, boolean, boolean, boolean, ClientProxyMembershipID, EntryEventImpl, boolean) LocalRegion.java:1334
> org.apache.geode.internal.cache.LocalRegion.get(Object, Object, boolean, EntryEventImpl) LocalRegion.java:1319
> org.apache.geode.internal.cache.AbstractRegion.get(Object) AbstractRegion.java:408
> org.rdb.geode.session.GeodeDatabaseSessionObject.lazyLoadField(String) GeodeDatabaseSessionObject.java:240
> net.lautus.gls.domain.life.accounting.AccountingTransaction.lazyLoadField(String) AccountingTransaction.java:1
> org.rdb.internal.aspect.PersistenceAspect.getField(JoinPoint, Object) PersistenceAspect.java:68
> net.lautus.gls.domain.life.accounting.AccountingTransaction.thoroughValidate() AccountingTransaction.java:33
> net.lautus.gls.tools.validation.ValidateDomainObjectScript.run(DatabaseSession, PersistentDomainObject) ValidateDomainObjectScript.java:36
> net.lautus.gls.tools.validation.ValidateDomainObjectScript.run(DatabaseSession, Object) ValidateDomainObjectScript.java:13
> org.rdb.util.validator.internal.geode.GeodeValidationRunnable.validateInstance(Object, InstanceScript, DatabaseSession) GeodeValidationRunnable.java:100
> org.rdb.util.validator.internal.geode.GeodeValidationRunnable.operation(TransactionStrategy, OrderedObject) GeodeValidationRunnable.java:84
> org.rdb.util.validator.internal.geode.GeodeValidationRunnable.operation(TransactionStrategy, Object) GeodeValidationRunnable.java:22
> org.rdb.util.finder.WorkerRunnable.execute() WorkerRunnable.java:39
> org.rdb.util.finder.ThreadRunnable.run() ThreadRunnable.java:45
> java.lang.Thread.run() Thread.java:748
>
> My worry is this logic:
> ObjectGraphSizer.size
>
> Find the size of an object and all objects reachable from it using breadth
> first search. This method will include objects reachable from static fields
>
> We have tried to use the size logic and we found that we have a lot of
> connected graphs/objects and a root object that reported 19 GB.
>
> Our objects have a lot of fields.
>
> While our objects do use IDs to other objects for one-to-one and one-to-many
> objects, we actually resolve these IDs and build up a tree in memory:
> Account {
>     Bank bank
> }
>
> Transform for storage on disk as:
> Account {
>     long bankId
> }
>
> Read from disk:
> Account {
>     long bankId
> }
> then on first access transform to:
> Account {
>     Bank bank
> }
>
> This means that we could build up the whole connected tree in memory.
>
> I know Geode is not a Graph database or Object database and so we might not
> be using it for the correct use case... maybe that is our fundamental
> problem.
>
> But even so... isn't this size check that is being performed during LRU
> eviction shown in the stack trace a big calculation?
> Is there a possibility to turn it off?
> Is it trying to see all connected objects so that all of them can be evicted?
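
To make the cost concrete, here is a minimal, self-contained sketch of a breadth-first reachability count, in the spirit of the ObjectGraphSizer description quoted above. It is not Geode's implementation: the class names (ReachableCount, Bank, ResolvedAccount, StoredAccount) and the helper buildAccountGraph are hypothetical, it counts objects rather than bytes, it does not handle arrays, and unlike the quoted description it skips static fields. It only illustrates why a value holding resolved references can drag in a whole connected graph, while a value holding plain IDs does not.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Queue;
import java.util.Set;

public class ReachableCount {

    // Hypothetical domain classes, loosely modelled on the Account/Bank example.
    static class Bank {
        final List<ResolvedAccount> accounts = new ArrayList<>();
    }

    static class ResolvedAccount {
        Bank bank;   // resolved reference: everything behind it becomes reachable
    }

    static class StoredAccount {
        long bankId; // plain ID: nothing else is reachable from this object
    }

    /** Counts the distinct objects reachable from root using breadth-first search. */
    static int countReachable(Object root) {
        // Identity-based set, so equal-but-distinct objects are counted separately.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Queue<Object> queue = new ArrayDeque<>();
        seen.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            Object current = queue.remove();
            if (current instanceof Iterable) {
                // Walk collection elements instead of reflecting into JDK internals.
                for (Object element : (Iterable<?>) current) {
                    visit(element, seen, queue);
                }
                continue;
            }
            // Reflect only over non-JDK classes; static and primitive fields are skipped.
            for (Class<?> c = current.getClass();
                 c != null && !c.getName().startsWith("java.");
                 c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) {
                        continue;
                    }
                    try {
                        f.setAccessible(true);
                        visit(f.get(current), seen, queue);
                    } catch (IllegalAccessException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        return seen.size();
    }

    private static void visit(Object o, Set<Object> seen, Queue<Object> queue) {
        if (o != null && seen.add(o)) {
            queue.add(o);
        }
    }

    /** Builds one bank with n resolved accounts and returns the first account. */
    static ResolvedAccount buildAccountGraph(int n) {
        Bank bank = new Bank();
        for (int i = 0; i < n; i++) {
            ResolvedAccount account = new ResolvedAccount();
            account.bank = bank;
            bank.accounts.add(account);
        }
        return bank.accounts.get(0);
    }

    public static void main(String[] args) {
        System.out.println("resolved: " + countReachable(buildAccountGraph(1000)));
        System.out.println("id-only:  " + countReachable(new StoredAccount()));
    }
}
```

With 1000 accounts attached to one bank, the resolved form reaches 1002 objects (the 1000 accounts, the bank, and its list), while the ID-only form reaches just 1. If the region values in the heap are in the resolved form, a sizer that walks the graph from each value would do this traversal per entry.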
>
> Some information on the environment:
>
> The database size on disk is around 47 GB
>
> The VM has 16 cores and 102 GB memory
>
> VM settings
>
> -agentpath:/home/r2d2/yourkit/bin/linux-x86-64/libyjpagent.so
> -javaagent:lib/aspectj/lib/aspectjweaver.jar
> -Dgemfire.EXPIRY_THREADS=16
> -Dgemfire.PREFER_SERIALIZED=false
> -Dgemfire.enable.network.partition.detection=false
> -Dgemfire.autopdx.ignoreConstructor=true
> -Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true
> -Dgemfire.member-timeout=600000
> -Xms75g
> -Xmx75g
> -XX:+UseConcMarkSweepGC
> -XX:+UseParNewGC
> -XX:+CMSParallelRemarkEnabled
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSInitiatingOccupancyFraction=70
> -XX:+DisableExplicitGC
> -XX:NewSize=21g
> -XX:MaxNewSize=21g
> -XX:+PrintGCDetails
> -XX:+PrintTenuringDistribution
> -XX:+PrintGCTimeStamps
> -XX:+PrintGCApplicationStoppedTime
> -verbose:gc
> -Xloggc:/home/r2d2/rdb-geode-server/gc/gc.log
> -Djava.rmi.server.hostname=localhost
> -Dcom.sun.management.jmxremote.port=9010
> -Dcom.sun.management.jmxremote.rmi.port=9010
> -Dcom.sun.management.jmxremote.local.only=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.ssl=false
> -XX:+UseGCLogFileRotation
> -XX:NumberOfGCLogFiles=10
> -XX:GCLogFileSize=1M
>
> <!-- copy-on-read:
> https://gemfire.docs.pivotal.io/geode/basic_config/data_entries_custom_classes/managing_data_entries.html -->
> <gfe:cache properties-ref="gemfire-props"
>     pdx-serializer-ref="pdxSerializer" pdx-persistent="true"
>     pdx-disk-store="pdx-disk-store" eviction-heap-percentage="80"
>     critical-heap-percentage="90"
>     id="gemfireCache" copy-on-read="false"
>     enable-auto-reconnect="true">
>
> </gfe:cache>
>
> Kindly
> Pieter
