>> why is 256MB -- the default value -- sufficient/insufficient
>
> We don't know. But how do you know that a cache of 10'000 "entries" is
> sufficient? Especially if each entry can be either 1 KB or 1 MB or 20 MB.
> The available memory can be divided into different areas, and each
> component is given a part of that. Then you look at performance, and see
> which component is slow, and you try to find out why. For example, it also
> depends on how expensive a cache miss is.

Yes, I agree... I was thinking more about this and realized that what
I was missing while debugging our issue wasn't configuration freedom
(in terms of entries) but rather an easier way to analyze the cache.
Cache hit/miss numbers felt a bit inadequate -- though that may be
because we were hitting an entirely different issue and the hit/miss
numbers weren't the problem for us. For us, increasing the cache size
improved performance, which turned out to be a red herring -- the
performance soon dropped back, and we were left wondering whether 2 GB
(that's the size we went for) wasn't cutting it. We wondered if there
was a way to empirically figure out whether we were shooting blank
arrows or not.
So, yes, I agree that configuration in terms of entries may not be
very useful (from an outside-Oak point of view... internal reasons
might still apply)... but I also feel that we are lacking some tooling
(I can't really say what exactly would be useful) to investigate. I
feel the hit/miss numbers alone aren't apparent enough.

> As for the cache size in amount of memory: the best way to know what a good
> number is, is to analyze the performance (how much time is spent reading,
> cache hit ratio,...)

We were running simple page loads -- some of them as trivial as
rendering a single page based on one resource (plus lots of JS and
CSS, page rendering scripts, and page and component nodes). We often
found that the first rendering took significantly more time and that
subsequent reads improved -- this is what directed our suspicion
towards the cache. We reset the cache stats, and the cache hit ratio
soon reached around 82%-85% (while on a local dev setup it stayed
above 98%). We tried checking which documents exist in the cache
(using the script console and a small Groovy script), and there were
lots that I wasn't touching directly (e.g. index nodes)... and I
didn't know whether they were even relevant for our page rendering or
not. I agree I could have started analyzing a subset of the cache
entries, but there were lots (around 31k, afair) and I just didn't
have enough energy to delve further.
So, it might be that I could have debugged the issue further to
confirm that the cache wasn't the problem -- but on the other hand, I
felt that there should be better ways to understand what's going on.

>> what should the course of action be when seeing a lot of cache misses: (a)
>> notify the application team, or (b) increase the cache size.
>
> It depends on the reason for the cache misses. There could be a loop over
> many nodes somewhere, in which case a larger cache might not really help
> (most caches are not scan resistant). There could be other reasons. But I
> don't see how the ability to configure the number of entries in the cache
> would help.

I agree that I don't have enough understanding of how the caches work
internally -- but, as I mentioned above, I couldn't quite figure out a
clean way to identify the cache access pattern -- the best I could
figure out was what existed in the cache at a given point.
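
On the scan-resistance point, a toy example of what (I think) you
mean: a plain LRU holds the hot working set fine until one scan over
many cold entries flushes it, which would also explain why bumping our
cache to 2 GB only delayed the drop-off rather than fixing it. The
sizes below are arbitrary, just to make the effect visible:

import java.util.LinkedHashMap;
import java.util.Map;

public class ScanEviction {
    public static void main(String[] args) {
        final int capacity = 100;
        // Access-ordered LinkedHashMap acting as a simple LRU cache.
        Map<String, String> lru = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> e) {
                return size() > capacity;
            }
        };
        // Hot working set: the pages we render over and over.
        for (int i = 0; i < capacity; i++) {
            lru.put("/content/page-" + i, "doc");
        }
        System.out.println("hot entry cached: "
                + lru.containsKey("/content/page-0"));  // true
        // One scan over many nodes (e.g. a traversal or index build)...
        for (int i = 0; i < 10_000; i++) {
            lru.put("/scanned/node-" + i, "doc");
        }
        // ...and the working set is gone; the next render misses again.
        System.out.println("hot entry cached: "
                + lru.containsKey("/content/page-0"));  // false
    }
}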

Thanks,
Vikas
