[ https://issues.apache.org/jira/browse/CASSANDRA-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485058#comment-14485058 ]
Benedict commented on CASSANDRA-8897:
-------------------------------------

bq. for page alignment we create a bigger buffer and slice it on an aligned buffer, is there a better way to do this?

No, but you can (and should) allocate a large block of buffers so that you only have to truncate one unit of alignment for all of them - say 512KB/1MB chunks, from which we slice smaller buffers.

bq. then they get evicted if they get cold.

The problem with the strategy you've taken is that we only evict entire queues, meaning we aren't very flexible. We also evict everything if the server is quiet for a period. This could lead to an odd situation: say, an infrequent spurt of traffic with an uncommon page size, with a steady drip of queries using that size, and then a 0.5s drop in the regular main type of traffic, with this main traffic now never getting to cache its buffers. More typically it's likely to lead to a random allocation of memory between the pools. There is also a race condition that could leak memory.

There are a lot of ways to skin this cat, but my suggestion would perhaps be much simpler, since we don't much mind the object allocation of the buffer wrapper, just the main body of it. Although we could avoid that too, so here are two suggestions.

Simpler:
* Have a shared queue for all buffer sizes, of slabs of some size, which are page aligned
* On allocation we increment a count, slice the buffer size we need from the current slab, and set the buffer's attachment field to the slab it came from (or have a map from parent buffer to slab)
* On deallocation we decrement the count, and if it has hit zero we recycle the slab
* If we want to be smart, we can track valid ranges we can slice from, but I don't think that's necessary. One thing we can do, though, is collect all of the buffers we need to service a single read request upfront, so that they all have the same lifespan and we don't promote fragmentation. Perhaps as a follow-up ticket.
* If we exceed our limit, we allocate a buffer of exactly the size we need (and don't bother page aligning)

A little more complex (but not necessarily better):
* Have a separate queue for each buffer size/type, still allocating slabs
* Maintain each slab in a globally shared LRU queue, and a local stack
* Serve requests from the top slab on the stack; when it's exhausted, pop it; when the slab is fully (or perhaps partially, if the stack is empty) available again, push it back onto the top of the stack
* If the stack is empty and there is available room, allocate a new slab; otherwise deallocate the oldest shared slab; if that slab is still in use, allocate a buffer of exactly the size we want, non-page-aligned

These are just suggestions; there are lots of possibilities when building a cache/pool like this.

bq. at the moment only the compressed RAR uses direct allocation

We should probably switch all readers to use direct. In fact, we should probably not allocate heap buffers in any situation where it isn't absolutely necessary.

> Remove FileCacheService, instead pooling the buffers
> ----------------------------------------------------
>
>         Key: CASSANDRA-8897
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-8897
>     Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>    Reporter: Benedict
>    Assignee: Stefania
>     Fix For: 3.0
>
>
> After CASSANDRA-8893, a RAR will be a very lightweight object and will not
> need caching, so we can eliminate this cache entirely. Instead we should have
> a pool of buffers that are page-aligned.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
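The "simpler" scheme in the comment above (a shared queue of slabs, a per-slab count of outstanding slices, and an exact-size unpooled allocation once the pool's limit is exceeded) might be sketched roughly as below. All class and method names ({{SlabPool}}, {{allocate}}, {{free}}) are illustrative, not Cassandra's actual API, and the sketch uses plain heap buffers and a parent map in place of the page-aligned direct allocations and buffer attachment field the comment describes:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.IdentityHashMap;

// Hypothetical sketch of the "simpler" pooling scheme: one shared queue of
// fixed-size slabs; each allocation slices from the current slab and bumps a
// count; each free decrements it, and when a slab's count reaches zero the
// slab is recycled. Oversize requests fall through to an exact, unpooled
// allocation, as the last bullet suggests.
final class SlabPool
{
    static final int SLAB_SIZE = 1 << 16; // 64KB slabs for the example

    static final class Slab
    {
        // Production code would allocateDirect() one extra page and slice at
        // the first page-aligned offset; allocate() keeps this sketch runnable.
        final ByteBuffer memory = ByteBuffer.allocate(SLAB_SIZE);
        int outstanding; // buffers sliced out and not yet freed
        int position;    // next free offset within the slab
    }

    private final ArrayDeque<Slab> recycled = new ArrayDeque<>();
    // Stands in for the buffer's "attachment" field: sliced buffer -> parent slab.
    private final IdentityHashMap<ByteBuffer, Slab> parent = new IdentityHashMap<>();
    private Slab current;

    synchronized ByteBuffer allocate(int size)
    {
        if (size > SLAB_SIZE)
            return ByteBuffer.allocate(size); // oversize: exact size, unpooled

        if (current == null || current.position + size > SLAB_SIZE)
        {
            Slab next = recycled.poll();      // reuse a fully freed slab if any
            current = next != null ? next : new Slab();
        }

        ByteBuffer window = current.memory.duplicate();
        window.position(current.position).limit(current.position + size);
        ByteBuffer result = window.slice();   // view over [position, position+size)
        current.position += size;
        current.outstanding++;
        parent.put(result, current);
        return result;
    }

    synchronized void free(ByteBuffer buffer)
    {
        Slab slab = parent.remove(buffer);
        if (slab == null)
            return; // oversize, unpooled buffer: let GC reclaim it

        if (--slab.outstanding == 0)
        {
            slab.position = 0;        // fully free again: reset and recycle
            if (slab != current)
                recycled.add(slab);
        }
    }
}
```

Note that a retired slab (one that filled up while some of its slices were still live) is reachable only through the parent map until its last slice is freed, at which point it re-enters the recycle queue whole; that is the recycling step the second and third bullets describe.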