I haven't found chunk cache to be particularly useful.  It's a fairly small 
cache that could only help when you're dealing with a small hot dataset.  I 
wouldn't bother increasing memory for it.

Key cache can be helpful, but it depends on the workload.  I generally 
recommend tuning for your hardware first, for the case where reads miss the 
cache.
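
For reference, here is a minimal sketch (Python) of the lifetime hit rates 
implied by the nodetool info counters in the quoted message below.  The numbers 
are copied from that output; the function and variable names are just for 
illustration.

```python
def hit_rate(hits, requests):
    """Fraction of requests served from cache (0.0 if there were no requests)."""
    return hits / requests if requests else 0.0

# Key cache: nodetool reports hits and requests directly.
key_cache_rate = hit_rate(13152259, 47205855)

# Chunk cache: nodetool reports misses, so hits = requests - misses.
chunk_requests = 162302465
chunk_misses = 143250511
chunk_cache_rate = hit_rate(chunk_requests - chunk_misses, chunk_requests)

# Average bytes per chunk cache entry: 992 MiB spread over 63488 entries.
chunk_entry_bytes = 992 * 1024 * 1024 // 63488

print(f"key cache lifetime hit rate:   {key_cache_rate:.3f}")
print(f"chunk cache lifetime hit rate: {chunk_cache_rate:.3f}")
print(f"avg chunk cache entry size:    {chunk_entry_bytes} bytes")
```

Both lifetime rates come out close to the "recent hit rate" values nodetool 
prints, so the low hit rate looks steady rather than a warm-up effect.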

Generally, cache is used to make up for bottlenecked I/O.  If you haven't 
already done so, I recommend looking at what you're actually doing in terms of 
device I/O (bitehist), comparing that to what's being requested from your 
filesystem (eBPF probe + histogram on vfs_read), and checking your page cache 
hit rate with cachestat.  You're likely to find you've got a ton of read 
amplification due to misconfigured compression or readahead, either of which 
can saturate your disks and make it appear that you need to give more memory 
to cache.  I always recommend optimizing for the worst case (all cache misses) 
first, then using cache to improve things, rather than papering over an 
underlying perf issue.
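
To make the comparison concrete, here is a rough sketch (Python) of estimating 
read amplification from two size histograms: one for device reads and one for 
vfs_read requests.  The bucket values are hypothetical, not measurements from 
any cluster, and it assumes you've already converted each tool's output to 
(bucket lower bound in bytes, count) pairs.

```python
def histogram_bytes(buckets):
    """Lower-bound estimate of total bytes from (lower_bound_bytes, count) pairs."""
    return sum(lower_bound * count for lower_bound, count in buckets)

def read_amplification(device_buckets, requested_buckets):
    """Ratio of bytes read from the device to bytes requested via vfs_read."""
    requested = histogram_bytes(requested_buckets)
    if requested <= 0:
        raise ValueError("no requested bytes recorded")
    return histogram_bytes(device_buckets) / requested

# Hypothetical example: mostly 64 KiB device reads serving mostly 4 KiB requests.
device_hist = [(65536, 900), (131072, 100)]
requested_hist = [(4096, 800), (16384, 200)]

amplification = read_amplification(device_hist, requested_hist)
print(f"estimated read amplification: {amplification:.1f}x")
```

A ratio well above 1 means the disks are doing far more work than the reads 
being asked for, which is worth fixing before adding cache.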

I wrote a bunch about this recently:

https://rustyrazorblade.com/post/2023/2023-11-07-async-profiler/
https://rustyrazorblade.com/post/2023/2023-11-14-bcc-tools/
https://rustyrazorblade.com/post/2023/2023-11-21-bpftrace/

Jon

On 2023/11/27 14:59:55 Sébastien Rebecchi wrote:
> Hello
> 
> When I use nodetool info, it prints that relevant information
> 
> Heap Memory (MB)       : 14229.31 / 32688.00
> Off Heap Memory (MB)   : 5390.57
> Key Cache              : entries 670423, size 100 MiB, capacity 100 MiB,
> 13152259 hits, 47205855 requests, 0.279 recent hit rate, 14400 save period
> in seconds
> Chunk Cache            : entries 63488, size 992 MiB, capacity 992 MiB,
> 143250511 misses, 162302465 requests, 0.117 recent hit rate, 2497.557
> microseconds miss latency
> 
> Here I focus on lines relevant for that conversation. And the numbers are
> roughly the same for all nodes of the cluster.
> The key and chunk caches are full and the hit rate is low. At the same time
> the heap memory is far from being used at full capacity.
> I would say that I can significantly increase the sizes of those caches in
> order to increase hit rate and improve performance.
> In cassandra.yaml, key_cache_size_in_mb has a blank value, so 100 MiB by
> default, and file_cache_size_in_mb is set to 1024.
> I'm thinking about setting key_cache_size_in_mb to 1024
> and file_cache_size_in_mb to 2048. What would you recommend? Is anyone
> having good experience with tuning those parameters?
> 
> Thank you in advance.
> 
> Sébastien.
> 
