[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964564#comment-13964564
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:
--------------------------------------------

bq. The JVM instruction set is besides the point. The point is what hotspot 
will do: with a single implementor or static method of small enough bytecode 
representation, it will be inlined. Note I said "multiple implementation" 
virtual method. With the option you suggest we will need an extra virtual 
invocation cost with every access to the underlying bytes, some extra math to 
access the right location, and one extra object field reference to locate the 
position we're offsetting from. These costs mount up rapidly.

How is that beside the point, when your claim is that virtual method calls with 
multiple implementations are slower than (and, unlike them, not inlined) the 
static method invocations from multiple classes — basically a constant_pool 
re-implementation — in your code? What I claim is that it doesn't matter 
whether you override a method multiple times or call a static method that 
calls another static method, as your patch does for DeletedCell: e.g. 
{Native, Buffer}DeletedCell.cellDataSize() calls 
DeletedCell.Impl.cellDataSize(this), which in turn delegates to 
Cell.Impl.cellDataSize(this). Just disassemble the classes as an example (with 
javap -c or similar) and see what bytecode is generated. As for the inlining 
problem, I would like to see proof of why those methods are not getting 
inlined (are they even touched by the JIT?) by enabling logging with 
-XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+PrintInlining and 
sharing the output; otherwise the claim that "multiple implementation" virtual 
methods are slow is just empty rhetoric.
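
(For reference, a minimal sketch of the two dispatch shapes under discussion; 
the class names below are illustrative stand-ins, not the actual patch code. 
Disassembling each with javap -c shows invokeinterface/invokevirtual at the 
call site of the first shape versus invokestatic for the second, which is 
exactly the distinction at issue; running a hot loop over both with the 
-XX:+PrintInlining flags above would show what HotSpot actually inlines.)

{code:java}
// Sketch only: simplified stand-ins for the Cell hierarchy in the patch.

// Shape 1: a virtual method with multiple implementors.
interface Cell {
    int cellDataSize();
}

final class BufferCell implements Cell {
    private final byte[] value = new byte[8];
    public int cellDataSize() { return value.length; }
}

final class NativeCell implements Cell {
    public int cellDataSize() { return 8; }
}

// Shape 2: the static delegation chain, analogous to
// {Native,Buffer}DeletedCell.cellDataSize() -> DeletedCell.Impl -> Cell.Impl.
final class DeletedCellImpl {
    static int cellDataSize(Cell c) {
        return CellImpl.cellDataSize(c); // invokestatic hop
    }
}

final class CellImpl {
    static int cellDataSize(Cell c) {
        return c.cellDataSize(); // eventually reads the underlying bytes
    }
}
{code}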

bq. Hmm. No, I now note your "client" implementation: what exactly is this one? 
Please clarify, as the thrift cell is going to need to be compared with the 
other implementations, and suddenly much of any benefit will disappear. The 
best way to make comparisons cheap and easy is to have both sides of the 
comparison have at least the same layout. If we have to either virtual invoke 
or instanceof check for every comparison, and a different code path for 
comparing each type of representation, there will be a performance impact. As 
such the only main benefit of this approach is eliminated in my eyes. Also, how 
will this "client" implementation achieve its various functions, and define its 
type? Seems like you'll need a duplicate hierarchy still.

That was just a suggestion for a temporary container between the client 
transport and the memtable: as those buffers are already allocated separately 
by Thrift, it seems reasonable to have Cell work with those buffers directly. 
It would take more memory for the ByteBuffer containers passed from Thrift, 
but the cell comparison logic should not change, because all cells would 
operate on the common container type; it's a similar concept to what Netty 
does with a ByteBuf gathered from other ByteBuf pieces.
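
(As an illustration of the Netty analogy, a minimal sketch using Netty's 
actual Unpooled/CompositeByteBuf API; how this would map onto Cassandra's Cell 
is only an assumption of the sketch.)

{code:java}
import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufUtil;
import io.netty.buffer.Unpooled;

public class CompositeBufferExample {
    public static void main(String[] args) {
        // Two buffers allocated separately, e.g. as they arrive from the transport.
        ByteBuf name  = Unpooled.copiedBuffer(new byte[] { 1, 2 });
        ByteBuf value = Unpooled.copiedBuffer(new byte[] { 3, 4 });

        // One logical view over both pieces: no copying, just a composite index.
        ByteBuf cellData = Unpooled.wrappedBuffer(name, value);

        // Comparison logic operates on the common ByteBuf type, regardless of
        // how many underlying pieces back each side of the comparison.
        ByteBuf other = Unpooled.copiedBuffer(new byte[] { 1, 2, 3, 4 });
        System.out.println(ByteBufUtil.compare(cellData, other)); // prints 0
    }
}
{code}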




> Slightly More Off-Heap Memtables
> --------------------------------
>
>                 Key: CASSANDRA-6694
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
>
>
> The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
> the on-heap overhead is still very large. It should not be tremendously 
> difficult to extend these changes so that we allocate entire Cells off-heap, 
> instead of multiple BBs per Cell (with all their associated overhead).
> The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 
> bytes per cell on average for the btree overhead, for a total overhead of 
> around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
> address (we will do alignment tricks like the VM to allow us to address a 
> reasonably large memory space, although this trick is unlikely to last us 
> forever, at which point we will have to bite the bullet and accept a 24-byte 
> per cell overhead), and 4-byte object reference for maintaining our internal 
> list of allocations, which is unfortunately necessary since we cannot safely 
> (and cheaply) walk the object graph we allocate otherwise, which is necessary 
> for (allocation-) compaction and pointer rewriting.
> The ugliest thing here is going to be implementing the various CellName 
> instances so that they may be backed by native memory OR heap memory.
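
(For illustration, a minimal sketch of the alignment trick described above, 
analogous to the JVM's compressed oops; the 8-byte alignment and resulting 
32GB range here are assumptions, not the patch's actual layout.)

{code:java}
// Sketch: compress an 8-byte-aligned native address into a 4-byte reference.
public final class CompressedAddress {
    private static final int SHIFT = 3; // 8-byte alignment -> 3 spare low bits

    // base is the start of the addressable region; 32 bits << 3 covers 32GB.
    static int encode(long base, long address) {
        assert (address - base) % (1L << SHIFT) == 0 : "address must be aligned";
        return (int) ((address - base) >>> SHIFT);
    }

    static long decode(long base, int ref) {
        // Treat the 4-byte reference as unsigned before shifting back up.
        return base + ((ref & 0xFFFFFFFFL) << SHIFT);
    }
}
{code}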



--
This message was sent by Atlassian JIRA
(v6.2#6252)
