[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204787#comment-17204787 ]
Aleksey Yeschenko commented on CASSANDRA-15229:
-----------------------------------------------

Thanks for fixing the test issues in the past couple of commits (and sorry for the delay in review).

One thing I'm not a fan of is the names of the two pools - permanent and temporary - as neither describes its respective pool. Something along the lines of 'long-lived' and 'short-lived' would work better. Or, perhaps, name them after their use cases - 'chunk-cache' and 'networking' pools.

Other than that:
1. {{PermanentBufferPool}} - unused class
2. {{Chunk#fullyRecycled}} is never read, only written to
3. {{putUnusedPortion()}} probably shouldn’t update overflow metric, as this will double-count some of the size when it’s {{put()}} back
4. nit: {{else if}} on L807 doesn’t need a pair of braces for the first two conditions

> BufferPool Regression
> ---------------------
>
>                 Key: CASSANDRA-15229
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Caching
>            Reporter: Benedict Elliott Smith
>            Assignee: Zhao Yang
>            Priority: Normal
>             Fix For: 4.0, 4.0-beta
>
>         Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png,
> 15229-recirculate-count.png, 15229-recirculate-hit-rate.png,
> 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png,
> 15229-unsafe.png
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we
> need to either change our behaviour to handle uncorrelated lifetimes or use
> something else. This is particularly important with the default chunk size
> for compressed sstables being reduced. If we address the problem, we should
> also utilise the BufferPool for native transport connections like we do for
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used
> for things with uncorrelated lifetimes, which essentially boils down to
> tracking those chunks that have not been freed and re-circulating them when
> we run out of completely free blocks. We should probably also permit
> instantiating separate {{BufferPool}} instances, so that we can insulate
> internode messaging from the {{ChunkCache}}, or at least have separate memory
> bounds for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce
> the amount of global coordination and per-allocation overhead. We don’t need
> 1KiB granularity for allocations, nor 16-byte granularity for tiny
> allocations.
> -----
> Since CASSANDRA-5863, the chunk cache has been implemented on top of the
> buffer pool. When a local pool is full, one of its chunks is evicted and only
> put back into the global pool once all buffers in the evicted chunk are
> released. But because of the chunk cache, buffers can be held for a long
> time, preventing the evicted chunk from being recycled even though most of
> its space is free.
> There are two things that need to be improved:
> 1. An evicted chunk with free space should be recycled to the global pool,
> even if it is not fully free. This is doable in 4.0.
> 2. Reduce fragmentation caused by differing buffer sizes. With #1, partially
> freed chunks become available for allocation, but the "holes" in a partially
> freed chunk have different sizes. We should consider allocating buffers of a
> fixed size, which is unlikely to fit in 4.0.
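
The first improvement in the issue description (recycling an evicted chunk to the global pool even when it is only partially free, and handing such chunks out once no completely free chunk is left) can be sketched roughly as below. This is a toy model and not Cassandra's {{BufferPool}}: the class {{RecirculatingPoolSketch}}, its methods, and the fixed 64-slot chunk layout are all assumptions made for illustration, and the sketch ignores thread safety and variable buffer sizes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration only; names and layout are not taken from Cassandra. */
final class RecirculatingPoolSketch
{
    /** A chunk divided into 64 equal slots; a set bit in freeSlots marks a free slot. */
    static final class Chunk
    {
        private long freeSlots = -1L; // all 64 bits set: every slot is free

        boolean isFullyFree() { return freeSlots == -1L; }
        boolean hasFreeSlot() { return freeSlots != 0L; }

        /** Takes the lowest-numbered free slot and returns its index, or -1 if none is free. */
        int take()
        {
            if (freeSlots == 0L)
                return -1;
            int slot = Long.numberOfTrailingZeros(freeSlots);
            freeSlots &= ~(1L << slot);
            return slot;
        }

        /** Marks a previously taken slot as free again. */
        void release(int slot) { freeSlots |= 1L << slot; }
    }

    // Stand-ins for the global pool: fully free chunks are preferred,
    // partially free chunks are re-circulated only when none are left.
    private final Deque<Chunk> fullyFree = new ArrayDeque<>();
    private final Deque<Chunk> partiallyFree = new ArrayDeque<>();

    /** Called when a local pool evicts a chunk, even if some of its buffers are still held. */
    void recycleEvicted(Chunk chunk)
    {
        if (chunk.isFullyFree())
            fullyFree.add(chunk);
        else if (chunk.hasFreeSlot())
            partiallyFree.add(chunk);
        // A chunk with no free slot is not queued in this sketch; a real pool
        // would need to keep tracking it until some buffer is released.
    }

    /** Returns a fully free chunk if available, otherwise a partially free one, otherwise null. */
    Chunk acquire()
    {
        Chunk chunk = fullyFree.poll();
        return chunk != null ? chunk : partiallyFree.poll();
    }
}
```

Because every slot here has the same size, the sketch sidesteps the fragmentation concern raised in the second improvement; with variable-sized buffers, the free regions of a re-circulated chunk would not all be equally usable.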