Hi Sling community, I want to share a recent experience I had with Sling Models, Sling Model caching and Garbage Collection problems.
I had a case where an AEM instance had massive garbage collection problems, but no memory problems. We saw the regular sawtooth pattern in the heap consumption, but also heavy GC activity (in a stop-the-world manner) almost constantly. There was no OutOfMemory situation, therefore it's not a memory leak.

I manually captured a heap dump and found a lot of Sling Models being referenced by the Sling ModelAdapterFactory cache. Rechecking these model classes in detail, I found that they specify "cache=true" in their @Model annotation. Once these settings were removed, the situation looked completely different and garbage collection was back to normal.

I don't have a full explanation for this behavior yet. The Sling Models had a reference to a ResourceResolver (which was properly closed). I assume that this reference somehow "disabled" the cleaning of the cache on major GCs (as it's a WeakHashMap) and instead tied the collection of these models to the collection of the ResourceResolver objects, which have finalizers registered. As far as I know, finalizers are only executed under memory pressure. This connection might have led to a situation where the Sling Model objects were not disposed of eagerly, but only alongside the finalizers, under high memory pressure and in a "stop-the-world" manner.

I am trying to collect more examples of this behavior, but I am not sure whether the caching of Sling Models as-is is something we should continue to use. This case was quite hard to crack, and I don't know if/how we can avoid such a situation by design.

Jörg

-- 
https://cqdump.joerghoh.de
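
To make the WeakHashMap part of the assumption above more concrete, here is a minimal, self-contained sketch in plain Java. It is not the Sling ModelAdapterFactory code; the class WeakCacheSketch and the Model stand-in are hypothetical names invented for illustration. It only shows the caveat documented for java.util.WeakHashMap: an entry is never removed while its value strongly references its key. Whether this is what actually happened in the case described above is not confirmed.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakCacheSketch {

    /** Hypothetical stand-in for a cached model; not a real Sling class. */
    static final class Model {
        final Object adaptable; // strong reference back to the cache key, or null

        Model(Object adaptable) {
            this.adaptable = adaptable;
        }
    }

    // Populate the cache in a separate method so that no local variable
    // keeps the key strongly reachable after this method returns.
    static void populate(Map<Object, Model> cache, boolean modelReferencesKey) {
        Object key = new Object();
        cache.put(key, new Model(modelReferencesKey ? key : null));
    }

    /** Returns the number of cache entries left after several GC requests. */
    static int entriesAfterGc(boolean modelReferencesKey) {
        Map<Object, Model> cache = new WeakHashMap<>();
        populate(cache, modelReferencesKey);
        for (int i = 0; i < 10 && !cache.isEmpty(); i++) {
            System.gc(); // only a request; the JVM is free to ignore it
            try {
                Thread.sleep(50); // give the reference handler time to enqueue cleared keys
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return cache.size();
    }

    public static void main(String[] args) {
        // The value holds the key: the entry stays, no matter how often GC runs.
        System.out.println("model references key: " + entriesAfterGc(true));
        // The value does not hold the key: the entry is usually gone after a GC.
        System.out.println("model without key ref: " + entriesAfterGc(false));
    }
}
```

In the first case the entry stays in the map for as long as the map itself is reachable. In the second case it usually disappears after a GC run, although System.gc() gives no guarantee about when.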