>> why is 256MB -- the default value -- sufficient/insufficient
>
> We don't know. But how do you know that a cache of 10'000 "entries" is
> sufficient? Especially if each entry can be either 1 KB or 1 MB or 20 MB.
> The available memory can be divided into different areas, and each
> component is given a part of that. Then you look at performance, and see
> which component is slow, and you try to find out why. For example, it also
> depends on how expensive a cache miss is.
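The quoted point about entry counts can be made concrete with a bit of arithmetic (the 10'000 entries and the 1 KB / 20 MB entry sizes are taken from the quote above; the class name is just for illustration): the same entry limit bounds wildly different amounts of memory, which is why a byte-based limit like 256 MB is easier to reason about than an entry count.

```java
public class EntryCountVsMemory {
    public static void main(String[] args) {
        long entries = 10_000;
        long smallEntry = 1_024L;                // 1 KB per entry
        long largeEntry = 20L * 1024 * 1024;     // 20 MB per entry

        // Same "10'000 entries" cap, two very different memory footprints:
        long smallTotalMb = entries * smallEntry / (1024 * 1024);
        long largeTotalGb = entries * largeEntry / (1024L * 1024 * 1024);
        System.out.println("10'000 x 1 KB  ~= " + smallTotalMb + " MB");   // ~9 MB
        System.out.println("10'000 x 20 MB ~= " + largeTotalGb + " GB");   // ~195 GB
    }
}
```

So an entry count that is "big enough" for one workload's entry-size distribution can be off by four orders of magnitude for another.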
Yes, I agree... I was thinking more about this and realized that what I was missing while debugging our issue wasn't configuration freedom (in terms of entries) but rather an easier way to analyze. Cache hits/misses felt a bit inadequate -- but that may be because we were hitting an entirely different issue and the hit/miss numbers weren't the problem for us. In our case, increasing the cache size improved performance, which turned out to be a red herring -- performance soon dropped back and we were left wondering whether 2 GB (the size we went for) wasn't cutting it. We wondered if there is a way to empirically figure out whether we were shooting blank arrows. So yes, I agree that configuration in terms of entries may not be very useful (from an outside-Oak point of view... internal reasons might still apply)... but I also feel that we are lacking tooling (I can't really say what exactly would be useful) to investigate. The hit/miss numbers alone don't make the problem apparent enough.

> As for the cache size in amount of memory: the best way to know what a
> good number is, is to analyze the performance (how much time is spent
> reading, cache hit ratio, ...)

We were running simple page loads -- some of them as trivial as rendering a single page based on one resource (plus lots of js/css, page rendering scripts, and the page and component nodes). We often found that the first rendering took significantly more time and subsequent reads improved -- this is what directed us towards suspecting the cache. We reset the cache stats and the cache hit ratio soon reached around 82%-85% (while on a local dev setup it stayed above 98%). We tried checking which documents exist in the cache (using the script console and a small groovy script) and there were lots that I wasn't touching directly (e.g. index nodes etc.)... and I didn't know whether they were even relevant for our page rendering or not.
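To make the hit/miss numbers a bit more tangible, here is a minimal sketch of the kind of instrumented cache I had in mind. All names (StatsLruCache, hitRate, resetStats) are made up for illustration -- this is not Oak's document cache, just a plain LRU built on LinkedHashMap -- but it shows the reset-then-measure workflow described above, and also how a single pass over many cold entries can flush the whole working set:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Minimal instrumented LRU cache sketch. Hypothetical names; NOT Oak's
 * cache implementation -- just a way to reason about hit/miss numbers.
 */
public class StatsLruCache<K, V> {
    private final Map<K, V> map;
    private long hits, misses;

    public StatsLruCache(int capacity) {
        // accessOrder = true gives LRU ordering; the eldest entry is
        // evicted once the cache grows past its capacity.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public V get(K key) {
        V value = map.get(key);
        if (value == null) { misses++; } else { hits++; }
        return value;
    }

    public void put(K key, V value) { map.put(key, value); }

    public long hits() { return hits; }
    public long misses() { return misses; }

    /** Hit ratio since the last reset, as when "resetting cache stats". */
    public double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public void resetStats() { hits = 0; misses = 0; }

    public static void main(String[] args) {
        StatsLruCache<Integer, String> cache = new StatsLruCache<>(100);
        // Warm a working set of 100 entries, then read it repeatedly.
        for (int i = 0; i < 100; i++) cache.put(i, "node-" + i);
        cache.resetStats();
        for (int round = 0; round < 10; round++)
            for (int i = 0; i < 100; i++) cache.get(i);
        System.out.println("steady-state hit rate: " + cache.hitRate());

        // One scan over 1000 cold entries evicts the whole working set:
        for (int i = 1000; i < 2000; i++) cache.put(i, "node-" + i);
        cache.resetStats();
        for (int i = 0; i < 100; i++) cache.get(i);
        System.out.println("hit rate after scan:   " + cache.hitRate());
    }
}
```

The second printed ratio collapses to 0.0 because the one-off scan evicted the entire working set: a plain LRU is not scan resistant, so a high miss rate can reflect the access pattern rather than an undersized cache.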
I agree I could start analyzing a subset of the cache entries, but there were lots (around 31k afair) and I just didn't have enough energy to delve further. So it might be that I could have debugged the issue further to confirm that the cache wasn't the problem -- but on the other hand, I felt that there should be better ways to understand what's going on.

>> what should the course of action be when seeing a lot of cache misses:
>> (a) notify the application team, or (b) increase the cache size.
>
> It depends on the reason for the cache misses. There could be a loop over
> many nodes somewhere, in which case a larger cache might not really help
> (most caches are not scan resistant). There could be other reasons. But I
> don't see how the ability to configure the number of entries in the cache
> would help.

I agree that I don't have enough understanding of the internals of how the caches work -- but, as I mentioned above, I couldn't quite figure out a clean way to identify the cache access pattern; the best I could do was see what all existed there.

Thanks,
Vikas