[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082081#comment-17082081 ]
ZhaoYang commented on CASSANDRA-15229:
--------------------------------------

{quote}
Recirculating immediately will lead to greater inefficiency in allocation, as we will attempt to reuse partially freed chunks in preference to entirely freed chunks, leading to a great deal more churn in the active blocks. This will affect the networking pooling as much as the chunk cache.
{quote}
In networking, most of the time a buffer is released immediately after allocation, and with {{recycleWhenFree=false}} a fully freed chunk is reused rather than recycled to the global list. Partial recycling is unlikely to affect networking usage. I am happy to test it.

{quote}
At the very least this behaviour should be enabled only for the ChunkCache, but ideally might have e.g. two queues, one with guaranteed-free chunks, another (perhaps for ease a superset) containing those chunks that might or mightn't be free.
{quote}
It's a good idea to have a separate queue and give partially freed chunks lower priority than fully freed chunks. That way, partially freed chunks will likely have more free space by the time they are reused than if they were reused immediately.

{quote}
if using Unsafe.allocateMemory wouldn't be simpler, more efficient, less risky and produce less fragmentation.
{quote}
It is simpler, but not more efficient. Without slab allocation, won't it create fragmentation in system direct memory? I tested with {{ByteBuffer#allocateDirect}} and {{Unsafe#allocateMemory}}; both latencies are slightly worse than baseline.

By the way, I think it would be nice to add a new metric to track direct ByteBuffer allocations made outside of the buffer pool, because they may be held by the chunk cache for a long time.

Chunk cache with [ByteBuffer.allocateDirect|https://github.com/jasonstack/cassandra/commit/c3f286c1148d13f00364872413733822a4a2c475]:
!15229-direct.png|width=600,height=400!
Chunk cache with [Unsafe.allocateMemory|https://github.com/jasonstack/cassandra/commit/3dadd884ff0d8e19d3dd46a07a290762755df312]:
!15229-unsafe.png|width=600,height=400!

> BufferPool Regression
> ---------------------
>
>                 Key: CASSANDRA-15229
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Caching
>            Reporter: Benedict Elliott Smith
>            Assignee: ZhaoYang
>            Priority: Normal
>             Fix For: 4.0, 4.0-beta
>
>         Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png,
> 15229-recirculate-count.png, 15229-recirculate-hit-rate.png,
> 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png,
> 15229-unsafe.png
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we
> need to either change our behaviour to handle uncorrelated lifetimes or use
> something else. This is particularly important with the default chunk size
> for compressed sstables being reduced. If we address the problem, we should
> also utilise the BufferPool for native transport connections like we do for
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used
> for things with uncorrelated lifetimes, which essentially boils down to
> tracking those chunks that have not been freed and re-circulating them when
> we run out of completely free blocks. We should probably also permit
> instantiating separate {{BufferPool}}, so that we can insulate internode
> messaging from the {{ChunkCache}}, or at least have separate memory bounds
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce
> the amount of global coordination and per-allocation overhead. We don’t need
> 1KiB granularity for allocations, nor 16 byte granularity for tiny
> allocations.
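
For illustration, the recirculation idea discussed above (prefer fully freed chunks and fall back to partially freed ones only when none remain) could be sketched roughly as follows. This is not the patch and not Cassandra's BufferPool API: TwoQueueChunkPool, Chunk, recycle and acquire are hypothetical names, the sketch is single-threaded, and it tracks only byte counts rather than real memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical sketch of a two-queue chunk pool; NOT Cassandra's BufferPool. */
final class TwoQueueChunkPool {
    /** A chunk that only records how many of its bytes are still in use. */
    static final class Chunk {
        final int capacity;
        int used;

        Chunk(int capacity) { this.capacity = capacity; }

        int free() { return capacity - used; }
    }

    // Chunks with no live buffers: always handed out first.
    private final Deque<Chunk> fullyFree = new ArrayDeque<>();
    // Chunks that still hold live buffers: lower priority, FIFO order, so a
    // chunk waits in the queue and may have more of its space freed meanwhile.
    private final Deque<Chunk> partiallyFree = new ArrayDeque<>();

    /** Returns a chunk to the pool, choosing the queue by its current usage. */
    void recycle(Chunk chunk) {
        if (chunk.used == 0) {
            fullyFree.addLast(chunk);
        } else {
            partiallyFree.addLast(chunk);
        }
    }

    /** Hands out a chunk with at least {@code size} free bytes, or null if none. */
    Chunk acquire(int size) {
        Chunk head = fullyFree.peekFirst();
        if (head != null && head.capacity >= size) {
            return fullyFree.pollFirst();
        }
        // No usable fully free chunk: take the oldest partial chunk that fits.
        for (Iterator<Chunk> it = partiallyFree.iterator(); it.hasNext();) {
            Chunk candidate = it.next();
            if (candidate.free() >= size) {
                it.remove();
                return candidate;
            }
        }
        return null;
    }
}
```

In this sketch a partially freed chunk is never chosen while a fully freed chunk of sufficient capacity is available, which is the priority ordering proposed in the comment. A real implementation would also need thread safety and per-pool memory bounds.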