Hi everyone,

We have been using Ignite and Ignite.NET in a project for the past few months.
We currently have six Ignite servers (started with ignite.sh) and a number
of thick clients split across two .NET Core applications deployed on 30 servers.

We store de-normalized data in the Ignite data grid: one of the .NET Core
applications puts data into the cache and the other application is a gRPC
service that just reads that data to compute a response. The data is split
across a dozen caches, which are created programmatically by the application
that writes into them.

The caches are PARTITIONED and TRANSACTIONAL and the partitions have two
backups.

It's been working fine so far, but we identified one particular cache as
the most read. To reduce network usage and improve the response time of
the gRPC service, we decided to use a near cache. That particular cache has
~2300 entries, which occupy ~110MB of space, and the near cache is
configured with maxSize=5000 and maxMemorySize=500000000.
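
As a back-of-the-envelope check on those limits, the whole cache should fit in the near cache without hitting either eviction limit. This is only a sketch: it assumes "MB" means 10^6 bytes, that entries are roughly uniform in size, and that Ignite's own memory accounting is close to the figure we measured. The class and method names are made up for illustration:

```csharp
using System;

public static class NearCacheSizing
{
    // Figures quoted in this message; "MB" is assumed to mean 10^6 bytes.
    public const long EntryCount = 2_300;
    public const long CacheBytes = 110_000_000;
    public const long MaxSize = 5_000;
    public const long MaxMemorySize = 500_000_000;

    // Average entry size, assuming entries are roughly uniform.
    public static long AverageEntryBytes() => CacheBytes / EntryCount;

    // True when the whole cache fits under both near-cache limits.
    public static bool FitsWithoutEviction() =>
        EntryCount <= MaxSize && CacheBytes <= MaxMemorySize;

    public static void Main()
    {
        Console.WriteLine($"Average entry size: ~{AverageEntryBytes()} bytes");
        Console.WriteLine($"Fits without eviction: {FitsWithoutEviction()}");
    }
}
```

If those assumptions hold, the near cache on its own should level off at roughly the size of the full cache and then stop growing.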

[image: image.png]

The embedded JVM in the gRPC .NET Core application is started with the
following parameters:
-Xmx=1024
-Xms=1024
-Djava.net.preferIPv4Stack=true
-Xrs
-XX:+AlwaysPreTouch
-XX:+UseG1GC
-XX:+ScavengeBeforeFullGC
-XX:+DisableExplicitGC
-DIGNITE_NO_SHUTDOWN_HOOK=true
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.port=12345

If we don't use the near cache, on every gRPC call the server executes the
following code to get the cache (this works fine):
return _ignite.GetCache<TKey, TValue>(cacheName);

And if we want to use the near cache, that line is changed to:
var nearCacheCfg = new NearCacheConfiguration
{
    // Use LRU eviction policy to automatically evict entries whenever the
    // near cache reaches 5000 entries.
    EvictionPolicy = new LruEvictionPolicy
    {
        MaxSize = 5000, // 5000 elements
        MaxMemorySize = 500000000 // bytes
    }
};
return _ignite.GetOrCreateNearCache<TKey, TValue>(cacheName, nearCacheCfg);
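
Since that call runs on every request, one experiment we could run is to obtain the near cache handle once per cache name and reuse it. We do not know whether the repeated call is related to the memory growth; this would only rule it in or out. A minimal sketch, where `CacheHandleRegistry` is a hypothetical helper of our own and not part of Ignite.NET:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper (not part of Ignite.NET): memoizes one handle per cache name.
public sealed class CacheHandleRegistry<THandle> where THandle : class
{
    private readonly ConcurrentDictionary<string, Lazy<THandle>> _handles =
        new ConcurrentDictionary<string, Lazy<THandle>>();

    private readonly Func<string, THandle> _factory;

    public CacheHandleRegistry(Func<string, THandle> factory)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
    }

    // Lazy<T> makes sure the factory runs at most once per name,
    // even when several requests ask for the same cache concurrently.
    public THandle Get(string cacheName) =>
        _handles.GetOrAdd(
            cacheName,
            name => new Lazy<THandle>(() => _factory(name))).Value;
}
```

In our service the registry would be created once, with the factory `name => _ignite.GetOrCreateNearCache<TKey, TValue>(name, nearCacheCfg)`, and each gRPC call would then use `registry.Get(cacheName)` instead of calling Ignite directly.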

But since we added the near cache, the application's memory usage never
stabilizes: without the near cache the application uses ~2.5GB of RAM on
every server, but when we use the near cache its memory usage
never stops growing.

This is the memory usage of one of the servers with the gRPC application.
[image: image.png]

In the graph above, the version with the near cache was deployed on
February 3rd at 17:00. At 01:30 on February 4th the server started
swapping, and at around 07:45 the application crashed. This is a detail:
[image: image.png]

I would very much like to create a reproducer, but it looks like it would
take a very long time to reproduce the issue: the gRPC application needs
several hours to use all the memory, and given that every server with the
gRPC application receives around 90 requests per second, the memory leak,
if it exists, is very slow.
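
In the meantime, we could log a few process-level numbers periodically to narrow down where the growth happens: the managed .NET heap, or everything else in the process (the embedded JVM plus native allocations). This is a sketch using only standard .NET APIs; `MemorySample` and `MemoryProbe` are illustrative names, and the exact meaning of the process counters can vary by operating system:

```csharp
using System;
using System.Diagnostics;

// One snapshot of the process memory counters.
public readonly struct MemorySample
{
    public MemorySample(long managedBytes, long workingSetBytes)
    {
        ManagedBytes = managedBytes;
        WorkingSetBytes = workingSetBytes;
    }

    // Bytes currently thought to be allocated on the managed .NET heap.
    public long ManagedBytes { get; }

    // Physical memory used by the whole process, JVM included.
    public long WorkingSetBytes { get; }

    // Rough estimate of memory outside the managed heap.
    public long NonManagedEstimateBytes =>
        Math.Max(0, WorkingSetBytes - ManagedBytes);
}

public static class MemoryProbe
{
    public static MemorySample Take()
    {
        using (var process = Process.GetCurrentProcess())
        {
            // 'false' avoids forcing a garbage collection just to measure.
            return new MemorySample(
                GC.GetTotalMemory(false),
                process.WorkingSet64);
        }
    }

    public static void Main()
    {
        MemorySample s = Take();
        Console.WriteLine(
            $"managed={s.ManagedBytes} workingSet={s.WorkingSetBytes} " +
            $"nonManaged~={s.NonManagedEstimateBytes}");
    }
}
```

Comparing these samples over a few hours with the JVM heap usage visible through the JMX port configured above should at least show whether the growth is in the managed heap, the JVM heap, or native memory.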

Does anybody have any idea where the problem can be or how to find it?


Thank you very much
