[ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17080314#comment-17080314
 ] 

ZhaoYang commented on CASSANDRA-15229:
--------------------------------------

After discussing with [~stefania] offline, we see two issues with the buffer 
pool:
 * The chunk cache can hold a single buffer from a chunk, which prevents the 
entire chunk from being recycled for an arbitrary period.
 * Even if we recirculate partially freed chunks, differing allocation sizes 
cause fragmentation, which reduces utilization. That is why the forked 
version uses a uniform allocation size.
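
To make the two issues concrete, here is a minimal toy sketch. It is not the BufferPool code and not taken from either PR; {{ToyChunk}}, its slot layout, and the first-fit policy are assumptions chosen only for illustration. A few long-lived 1-slot buffers keep the chunk from ever becoming fully free, and with mixed sizes the free space ends up split into runs too small for a larger request.

```java
import java.util.BitSet;

/** Toy model of one pool chunk divided into equal slots (hypothetical; not Cassandra's BufferPool). */
final class ToyChunk {
    private final BitSet used; // bit set = slot currently allocated
    private final int slots;

    ToyChunk(int slots) {
        this.slots = slots;
        this.used = new BitSet(slots);
    }

    /** First-fit allocation of n contiguous slots; returns the start slot, or -1 if no run fits. */
    int allocate(int n) {
        int start = used.nextClearBit(0);
        while (start + n <= slots) {
            int nextUsed = used.nextSetBit(start);
            if (nextUsed < 0 || nextUsed >= start + n) {
                used.set(start, start + n);
                return start;
            }
            start = used.nextClearBit(nextUsed); // skip past the allocated run
        }
        return -1;
    }

    void free(int start, int n) { used.clear(start, start + n); }

    /** Only a fully free chunk can be handed back for recycling as a whole. */
    boolean isFullyFree() { return used.isEmpty(); }

    int freeSlots() { return slots - used.cardinality(); }

    /** Length of the longest contiguous run of free slots. */
    int largestFreeRun() {
        int best = 0;
        int i = used.nextClearBit(0);
        while (i < slots) {
            int end = used.nextSetBit(i);
            if (end < 0) end = slots;
            best = Math.max(best, end - i);
            i = used.nextClearBit(end);
        }
        return best;
    }

    public static void main(String[] args) {
        ToyChunk chunk = new ToyChunk(16);
        int[] shortLived = new int[4];
        for (int i = 0; i < 4; i++) {
            chunk.allocate(1);                 // long-lived buffer, e.g. held by a cache
            shortLived[i] = chunk.allocate(3); // short-lived buffer
        }
        for (int s : shortLived) chunk.free(s, 3);

        System.out.println(chunk.isFullyFree());    // false: the 1-slot buffers pin the chunk
        System.out.println(chunk.freeSlots());      // 12 of 16 slots are free
        System.out.println(chunk.largestFreeRun()); // 3
        System.out.println(chunk.allocate(4));      // -1: no run of 4 despite 12 free slots
    }
}
```

In this toy model, if every allocation were one slot, any free slot could serve any request, which is the intuition behind a uniform allocation size.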

The first issue should be solvable and less risky for 4.0. Here is a 
performance comparison of the baseline against recirculating partially freed 
chunks.

Setup: single node 16T - 8GB heap - 250m rows - mixed read 40k qps - write 10k 
qps - with 128 file cache
 * [baseline|https://github.com/jasonstack/cassandra/pull/8]: instantiates 2 
buffer pools, one for chunk cache, one for network.
 * [recirculate-partially-freed-chunk|https://github.com/jasonstack/cassandra/pull/11/files]: 
baseline + partially freed chunk recirculation.
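
For readers unfamiliar with the idea, the difference between the two configurations can be sketched roughly as below. This is a simplified model and not the code in either PR: {{ToyPool}}, {{Handle}}, the single-slot allocations, and the chunk-count limit are all hypothetical. When no fresh chunk is left, the recirculating variant reuses a chunk that still has live buffers in it instead of reporting a pool miss.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;

/** Toy pool with optional recirculation of partially freed chunks (hypothetical sketch). */
final class ToyPool {
    static final class Chunk {
        final BitSet used = new BitSet();
        final int slots;
        Chunk(int slots) { this.slots = slots; }
        boolean full() { return used.cardinality() == slots; }
        int take() { int i = used.nextClearBit(0); used.set(i); return i; } // caller checks full()
    }

    static final class Handle {
        final Chunk chunk; final int slot;
        Handle(Chunk chunk, int slot) { this.chunk = chunk; this.slot = slot; }
    }

    private final Deque<Chunk> partiallyFreed = new ArrayDeque<>();
    private final boolean recirculate;
    private final int slotsPerChunk;
    private int freshChunksLeft; // memory limit, expressed as a number of chunks
    private Chunk current;       // chunk currently being allocated from
    int hits, misses;

    ToyPool(int maxChunks, int slotsPerChunk, boolean recirculate) {
        this.freshChunksLeft = maxChunks;
        this.slotsPerChunk = slotsPerChunk;
        this.recirculate = recirculate;
    }

    /** Returns a handle to one slot, or null on a pool miss. */
    Handle allocate() {
        if (current == null || current.full()) current = nextChunk();
        if (current == null) { misses++; return null; }
        hits++;
        return new Handle(current, current.take());
    }

    private Chunk nextChunk() {
        if (freshChunksLeft > 0) { freshChunksLeft--; return new Chunk(slotsPerChunk); }
        // No fully free chunk left: optionally fall back to a partially freed one.
        return recirculate ? partiallyFreed.poll() : null;
    }

    void free(Handle h) {
        h.chunk.used.clear(h.slot);
        if (h.chunk == current) return; // still in use for allocation
        if (h.chunk.used.isEmpty()) {   // fully free: return the whole chunk
            partiallyFreed.remove(h.chunk);
            freshChunksLeft++;
        } else if (!partiallyFreed.contains(h.chunk)) {
            partiallyFreed.add(h.chunk);
        }
    }

    /** Fills 2 chunks of 4 slots, frees all but one slot per chunk, then allocates 6 more. */
    static int missesAfterChurn(boolean recirculate) {
        ToyPool pool = new ToyPool(2, 4, recirculate);
        Handle[] h = new Handle[8];
        for (int i = 0; i < 8; i++) h[i] = pool.allocate();
        for (int i = 0; i < 8; i++) if (i % 4 != 0) pool.free(h[i]); // h[0], h[4] stay live
        for (int i = 0; i < 6; i++) pool.allocate();
        return pool.misses;
    }

    public static void main(String[] args) {
        System.out.println(missesAfterChurn(false)); // 3
        System.out.println(missesAfterChurn(true));  // 0
    }
}
```

The miss counts here only illustrate the mechanism in this toy model; they say nothing about the magnitude of the measured results.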

baseline: | !15229-hit-rate.png|thumbnail! | !15229-count.png|thumbnail! | 
!15229-size.png|thumbnail! |

recirculation: | !15229-recirculate-hit-rate.png|thumbnail! | 
!15229-recirculate-count.png|thumbnail! | 
!15229-recirculate-size.png|thumbnail! |

QPS:  !15229-recirculate.png|thumbnail! 

With partially freed chunk recirculation, latency improves and buffer pool 
misses are reduced.

Should we proceed with recirculating partially freed chunks plus a separate 
pool for network cache in 4.0, and then port the forked buffer pool with 
uniform allocation size in 4.x?

> BufferPool Regression
> ---------------------
>
>                 Key: CASSANDRA-15229
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Caching
>            Reporter: Benedict Elliott Smith
>            Assignee: ZhaoYang
>            Priority: Normal
>             Fix For: 4.0, 4.0-beta
>
>         Attachments: 15229-count.png, 15229-hit-rate.png, 
> 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, 
> 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}}, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

