[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908521#comment-13908521
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
-------------------------------------------

bq. The basic idea is that we have made Cell and DecoratedKey both interfaces, 
and we have "buffer" and "native" implementations. The native implementations 
squash the implementation of CellName into the same object, so that we can 
avoid any allocation overhead, and so we don't need to allocate a new object 
every time we read the name. 

With you so far.

bq. As a result we have had to go a little anti-OOP; DecoratedKey and *Cell are 
now interfaces, with static implementation "modules", the methods of which are 
invoked by each implementation with themselves as the first parameter.

Not sure I follow; I only see the static methods sizeOf and construct in 
NativeCell, for instance.
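
For reference, one way to read the quoted description is sketched below. This is only an illustration of "interface plus static module, invoked with the implementation as the first parameter"; it is not code from the patch. The names CellModule, HeapCell and DirectCell, and the [timestamp][value] layout, are assumptions made up for this example, and a direct ByteBuffer stands in for native memory.

```java
import java.nio.ByteBuffer;

// All names below are hypothetical; they are not taken from the patch.
interface Cell {
    long timestamp();
    ByteBuffer value();
    int dataSize();
    boolean supersedes(Cell other);
}

// Static "module": each method takes the cell it operates on as its first parameter,
// so the logic is written once and shared by every implementation.
final class CellModule {
    private CellModule() {}

    static int dataSize(Cell self) {
        return 8 + self.value().remaining(); // 8-byte timestamp plus the value bytes
    }

    static boolean supersedes(Cell self, Cell other) {
        return self.timestamp() > other.timestamp();
    }
}

// Heap-backed implementation: state lives in ordinary Java fields.
final class HeapCell implements Cell {
    private final long timestamp;
    private final ByteBuffer value;

    HeapCell(long timestamp, ByteBuffer value) {
        this.timestamp = timestamp;
        this.value = value;
    }

    public long timestamp() { return timestamp; }
    public ByteBuffer value() { return value.duplicate(); }
    public int dataSize() { return CellModule.dataSize(this); }
    public boolean supersedes(Cell other) { return CellModule.supersedes(this, other); }
}

// Stand-in for a native implementation: a single direct buffer laid out as
// [timestamp: 8 bytes][value bytes]. The layout is an assumption for this sketch.
final class DirectCell implements Cell {
    private final ByteBuffer region;

    DirectCell(long timestamp, byte[] value) {
        region = ByteBuffer.allocateDirect(8 + value.length);
        region.putLong(0, timestamp);
        for (int i = 0; i < value.length; i++) {
            region.put(8 + i, value[i]);
        }
    }

    public long timestamp() { return region.getLong(0); }

    public ByteBuffer value() {
        ByteBuffer view = region.duplicate();
        view.position(8);
        return view.slice(); // view over the value bytes only
    }

    public int dataSize() { return CellModule.dataSize(this); }
    public boolean supersedes(Cell other) { return CellModule.supersedes(this, other); }
}
```

Both implementations delegate with `this` as the first argument, which is the "anti-OOP" part: the shared behaviour cannot live in a common base class if each implementation needs its own superclass or field layout.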

bq. without CASSANDRA-6697 we allocate a lot of ByteBuffers temporarily, i.e. 
whenever we read the constituents of the name or the contents of the cell

My preferred solution would be to stop extracting the name by itself so often.  
Spot-checking the code, it seems we usually do this just to "simplify" a 
comparison, so in principle the comparison could be done on the Cell object 
itself rather than on the extracted name.
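
A rough sketch of what comparing on the Cell could look like follows. The NamedCell interface, its nameLength/nameByte accessors, and ArrayBackedCell are hypothetical and exist only for this example. The plain unsigned bytewise ordering is a stand-in too; real name comparison in Cassandra depends on the cell name type.

```java
import java.util.Comparator;

// Hypothetical accessor interface: exposes the name bytes in place,
// so callers never need a separate name object or ByteBuffer.
interface NamedCell {
    int nameLength();
    byte nameByte(int index);
}

// Compares two cells by name directly, without materialising either name.
// Assumption: unsigned lexicographic byte order is the desired ordering.
final class CellNameComparator implements Comparator<NamedCell> {
    @Override
    public int compare(NamedCell a, NamedCell b) {
        int common = Math.min(a.nameLength(), b.nameLength());
        for (int i = 0; i < common; i++) {
            int cmp = (a.nameByte(i) & 0xff) - (b.nameByte(i) & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        // Equal on the common prefix: the shorter name sorts first.
        return a.nameLength() - b.nameLength();
    }
}

// Minimal implementation used only to exercise the comparator.
final class ArrayBackedCell implements NamedCell {
    private final byte[] name;

    ArrayBackedCell(byte[] name) {
        this.name = name.clone();
    }

    public int nameLength() { return name.length; }
    public byte nameByte(int index) { return name[index]; }
}
```

A native-memory implementation of the same interface could read the bytes straight from its backing region, so the comparison itself would not allocate.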


> Slightly More Off-Heap Memtables
> --------------------------------
>
>                 Key: CASSANDRA-6694
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
>
>
> The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
> the on-heap overhead is still very large. It should not be tremendously 
> difficult to extend these changes so that we allocate entire Cells off-heap, 
> instead of multiple BBs per Cell (with all their associated overhead).
> The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 
> bytes per cell on average for the btree overhead, for a total overhead of 
> around 20-22 bytes). The 16 bytes break down as an 8-byte object overhead, a 
> 4-byte address, and a 4-byte object reference for maintaining our internal 
> list of allocations. For the address, we will do alignment tricks like the VM 
> to allow us to address a reasonably large memory space, although this trick 
> is unlikely to last us forever, at which point we will have to bite the 
> bullet and accept a 24-byte per-cell overhead. The list of allocations is 
> unfortunately necessary because we cannot otherwise safely (and cheaply) walk 
> the object graph we allocate, and walking it is required for (allocation-) 
> compaction and pointer rewriting.
> The ugliest thing here is going to be implementing the various CellName 
> instances so that they may be backed by native memory OR heap memory.
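
To make the 4-byte address and "alignment tricks" in the description concrete, here is a minimal sketch of storing an aligned offset in 32 bits. The 8-byte alignment, the class name CompressedAddress, and the use of offsets relative to a region base are all assumptions for illustration; the description does not specify them.

```java
// Hypothetical helper: packs an 8-byte-aligned offset into a 4-byte int
// by dropping the low bits that alignment guarantees are zero.
final class CompressedAddress {
    static final int ALIGNMENT_SHIFT = 3; // assumes 8-byte alignment

    private CompressedAddress() {}

    static int encode(long offset) {
        if (offset < 0 || (offset & ((1L << ALIGNMENT_SHIFT) - 1)) != 0) {
            throw new IllegalArgumentException("offset must be non-negative and 8-byte aligned");
        }
        long shifted = offset >>> ALIGNMENT_SHIFT;
        if (shifted > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("offset too large for a 4-byte address");
        }
        return (int) shifted; // may be negative as an int; decode treats it as unsigned
    }

    static long decode(int compressed) {
        return (compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }

    // Number of bytes addressable with 32 bits at this alignment.
    static long addressableBytes() {
        return 1L << (32 + ALIGNMENT_SHIFT);
    }
}
```

Under these assumptions 32 bits cover 2^35 bytes (32 GiB), which is why the description expects the trick to run out eventually.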



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
