[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-05-01 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986642#comment-13986642
 ] 

Michael Shuler commented on CASSANDRA-6694:
---

Comparing the default 2.1 dtest runs with offheap dtest runs shows the same 
results:
http://cassci.datastax.com/job/cassandra-2.1_dtest/132/
http://cassci.datastax.com/job/cassandra-2.1_offheap_dtest/4/

The 2.1 unit tests got a default config with this commit for 
memtable_allocation_type: offheap_objects and they appear pretty stable. 
(some new unit test errors have come up since CASSANDRA-6855, but no big 
regression appears to have come from this commit)

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 6694.fix1.txt


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984175#comment-13984175
 ] 

Benedict commented on CASSANDRA-6694:
-

I've pushed another slightly tweaked 
[branch|https://github.com/belliottsmith/cassandra/tree/6694-final2]. This is 
taken from Pavel's branch, except I fixed two of the equals() tidies (they were 
missing outer parentheses) and reintroduced the _use_ of the MemtableAllocator 
in CFS.FlushLargestMemtable and CFS.logFlush for indexes, but only if there is 
an underlying CFS to the index (so no API change)

I've also rebased to latest 2.1. 

[~slebresne] if you're happy with AbstractNativeCell can you commit when you 
get a chance? Thanks

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984208#comment-13984208
 ] 

Benedict commented on CASSANDRA-6694:
-

Pushed one last change, based on comments offline by Sylvain, to construct the 
full text representation of the Identifier when we can't find it in CFMetaData.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983417#comment-13983417
 ] 

Benedict commented on CASSANDRA-6694:
-

I've pushed a completed branch 
[here|https://github.com/belliottsmith/cassandra/tree/6694-reorg2]

I've taken to completion your flattening of the PoolAllocator and DataAllocator 
hierarchies, implemented DecoratedKey, reintroduced the extra unit tests, fixed 
some bugs with the Cell hierarchy, slightly rejigged the data layout for native 
cell to simplify offset calculation and fixed a performance regression and the 
message digest optimisation.

The only thing I haven't done is the refactors I would like to perform before 
we finally commit this, so as to make review easier for others.

Note I'm still running dtests and doing some final vetting, but I wanted to 
post this message now as I reckon this version is most likely ready and this is 
somewhat time critical, and because I want to avoid any duplicated effort in 
getting a final patch together.

I think I've addressed your concern's [~iamaleksey], however with the following 
notes:

bq. getAllocator() doesn’t belong to SecondaryIndex, API-wise. CFS#logFlush() 
and CFS.FLCF#run() should just use 
SecondaryIndexManager#getIndexesNotBackedByCfs() and get their allocators 
directly instead of using SIM#getIndexes() and checking for null.

This was a conscious decision to permit custom 2i use our allocators and count 
towards book keeping for memory utilisation.

bq. Composite/CellName/CellNameType/etc#copy() all now have an extra CFMetaData 
argument, while only NativeCell really uses it. Can we isolate its usage to a 
NativeCell-specific methods and leave the rest alone?

Not sure how we do that when either can be present when you want to perform 
these calls. Possible I'm missing something obvious though, so please do let me 
know :)

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983425#comment-13983425
 ] 

Benedict commented on CASSANDRA-6694:
-

Oh, also, [~iamaleksey]: Your assertion about super columns and sparse 
composites appears to be broken by the CliTest somehow. I haven't investigated, 
but this is why I introduced that branch. I've stripped out the condition and 
just always take that branch if we fail to lookup in cfMetaData, to deal with 
names being dropped whilst we aren't expecting, so it no longer assumes this 
but also copes with it.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983666#comment-13983666
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

bq. This was a conscious decision to permit custom 2i use our allocators and 
count towards book keeping for memory utilisation.

I feel like you are lacking context here, wrt custom 2i actually are and what 
implementations we have. The ones that exist don't need it, so IMO this was a 
wrong decision, even if conscious. Have a look at DSE's Solr implementation if 
curious.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983670#comment-13983670
 ] 

Benedict commented on CASSANDRA-6694:
-

I don't mind dropping it, but it seems a harmless addition for users who 
implement their own buffered writes for secondary indexes, so that they can 
consider the amount of data they are using for their own state when deciding 
which CFs to flush. The fact that DSE doesn't do this doesn't mean it isn't 
useful.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983675#comment-13983675
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

Sure, but I prefer to add stuff when it's clear that there is someone who 
actually needs it, and not some hypothetical user that doesn't exist.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983708#comment-13983708
 ] 

Benedict commented on CASSANDRA-6694:
-

To make shipping easier, I've pushed a rebased and squashed branch 
[here|https://github.com/belliottsmith/cassandra/tree/6694-reorg2-rebase]

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983709#comment-13983709
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Here is a [new branch|https://github.com/xedin/cassandra/compare/6694-final] 
with all of the changes from 6694-reorg2 squashed and added a couple of commits 
to cleanup and remove secondary index getAllocator which is unnecessary right 
now. I was about to push similar refactoring for memtable pools to my branch, 
which made review much faster :)

I'm +1 on combination of the squashed changes and cleanup which is in my 
branch, still not sure about CellName implementation in AbstractNativeCell tho, 
that is not my realm, so it would be nice if Sylvain (or somebody as close to 
that code, if anybody) could take a look...

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-25 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980847#comment-13980847
 ] 

Benedict commented on CASSANDRA-6694:
-

On the whole it looks good, but I have the following comments/concerns:

# DecoratedKey still isn't implemented (should be a relatively minor addition)
# The performance regression for MessageDigest updating is still there
# AbstractCell.localCopy(..MemtableAllocator) needs to be overridden; as it is 
you'll always get a regular Cell back
# You're still using static method implementations, it looks like? Cell.diff 
and Cell.reconcile
# I'm not a fan of mixing the util.memory hierarchy with knowledge of the 
memtable hierarchy. If we plan on this, I'd much prefer to move the whole lot 
into e.g. db.memtable; this might make most sense anyway
# I'd like to move the Cell implementations out of db into something (e.g. 
.memtable) as it's very crowded in there, and they're a dozen or so related 
classes that are easily extracted
# Given how many different kinds of allocator we now have (including 
IAllocator), I'd really like to rename AbstractAllocator to something more 
descriptive like ByteBufferAllocator

Still need to verify all of the changes within the Cells, as comparison is 
currently tricky due to different hierarchy confusing git, but on the whole 
this branch is good if we address these concerns.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-25 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981434#comment-13981434
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


1-2, 5-7 I will address once the main functionality is settled.

bq. AbstractCell.localCopy(..MemtableAllocator) needs to be overridden; as it 
is you'll always get a regular Cell back

Good catch, I forgot to change that before I pushed. Have amended it to the 
original allocator commit and force pushed to my branch, so it's available.

 

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-24 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980585#comment-13980585
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


I have pushed allocation pools and minor refactoring to [my 
branch|https://github.com/xedin/cassandra/compare/CASSANDRA-6694], also 
addressed some of the problems from [~iamaleksey]'s comment expect concerns 
about CellName implementation for AbstractNativeCell which are mutual. 

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-22 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1394#comment-1394
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

h3. Benedict’s original branch

ABTC.ColumnUpdater#apply() calls update.reconcile(existing) and skips 
localCopy() if reconciled == existing. This means that we should optimise all 
reconcile() implementations to prioritise the argument cell in case of ties for 
optimal savings (and ties happen often enough, from retries and whatnot + 
potentially counter updates if we decide to do that thing when batch commit log 
is enabled). Currently we do the opposite. Would be easiest to simply swap the 
call to existing.reconcile(update).

getAllocator() doesn’t belong to SecondaryIndex, API-wise. CFS#logFlush() and 
CFS.FLCF#run() should just use SecondaryIndexManager#getIndexesNotBackedByCfs() 
and get their allocators directly instead of using SIM#getIndexes() and 
checking for null.

Composite/CellName/CellNameType/etc#copy() all now have an extra CFMetaData 
argument, while  only NativeCell really uses it. Can we isolate its usage to a 
NativeCell-specific methods and leave the rest alone?

At least NativeCell#cql3ColumnName() can throw NPE when calling 
metadata.getColumnDefinition(buffer).name. Just because it’s SIMPLE_SPARSE 
doesn’t mean all the column names are predefined - it’s legal to insert 
non-predefined cells w/ default_validator validator via Thrift/CQL2.

NativeCell#copy(), COMPOUND_SPARSE branch - there is no way a compound sparse 
comparator and cfType = Super can coexist. Supers are all compound dense.

Generally, NativeCell methods seem to assume a bit too much about the sizes and 
about what can and what can’t be present/absent. You can even guarantee  
presence of a ColumnIdentifier for COMPOUND_SPARSE, and yet NativeCell#copy() 
would throw an AssertionError is that’s the case. And CFMetaData is mutable, 
too, and it is possible to remove a column via ALTER TABLE at any time.

I’m not comfortable +1-ing it until Sylvain has a look at at least these bits 
(just the NativeCell methods).

Allocator hierarchy is confusing - I won’t claim having understood it entirely, 
as are the names there. ‘Data’ prefix in DataAllocator is absolutely 
meaningless in the context. Maybe MemtableAllocator would be more meaningful? 
Don’t have suggestions for the rest of the names and for making that hierarchy 
more straightforward, but I can live with it as it is.

I very much dislike the Impl thing though. This is an uncomfortable step back 
in Cell* hierarchy readability. Basic things like using IDEA’s Find Usages on 
Cell.Impl#localCopy() not showing Counter/Expiring/Deleted counterparts’ usage 
are annoying. This is my largest, and, really the only fundamental issue with 
the branch. Other than that, and too many assumptions in certain NativeCell 
methods, I’m okay with the branch.

Overall it looks reasonable, and is actually less invasive than I was afraid it 
would be.

Nits: AbstractMemory formatting is all messed up.

h3. Pavel’s refactoring branch

Doesn’t build (although trivial-ish to make it build) and is incomplete (as 
expected), and that does complicate judging the ugliness of the result.

Same issues and potential issues in AbstractNativeCells methods as in 
NativeCell methods in the other branch.

Can’t form an opinion on Pavel’s Allocator/Pool approach, because it’s not here 
yet, and I’m not sure I got it right from just reading the comments.

This *Cell hierarchy, though, I feel a lot more comfortable with.

I feel strongly that we should borrow the Impl-less Cell hierarchy from this 
branch, if nothing else (and there isn’t much else yet) - this is my biggest 
issue with the original.

As for the rest of it - the time is running low, we have to ship 2.1 
eventually. Any chance you could flesh it out in the next few days, maybe until 
Monday, Pavel? If not, I’m not sure if we should block beta2 further :\

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 

[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-21 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975540#comment-13975540
 ] 

Benedict commented on CASSANDRA-6694:
-

[~iamaleksey] have you had a chance to take a look and form an opinion? I'm 
happy to proceed with either approach, but we want to get a move on with one or 
the other if we intend to include this in 2.1.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-21 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975677#comment-13975677
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

[~benedict] It's on the top of my TODO list - so I'm looking at your branch 
now. Haven't looked at Pavel's yet. Need a few days, unless something more 
urgent distracts me (which is very unlikely).

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972524#comment-13972524
 ] 

Benedict commented on CASSANDRA-6694:
-

So, on the whole I really don't perceive this approach as better: there's a 
great deal of code duplication now (set to get worse still when you finish the 
refactor for DecoratedKey), between each of the correspondingly named cell 
implementations. Personally I think the Impl approach is neater as a result of 
avoiding that (this may be more pronounced if we decide to optimise equals() is 
you suggested). That said, if this moves us forwards I can live with it, if you 
can address point 1 below.

There are a few problems though:

# I am *very* opposed to a public setPeer() method. This is a deal breaker for 
me - but it can be avoided with a bit more refactoring.
# Your optimised updateDigest function is actually much slower than the old 
implementation for all but the smallest values: an optimised version needs to 
batch the contents into an array (stored in a ThreadLocal) and call 
updateDigest with the array, unless the total size is very small (there's a 
crossover point on my laptop of about 12 bytes, under which it's faster to call 
update(byte)).
# AbstractNativeCell.getBytes actually calls setBytes
# excessHeapSize... should be unsharedHeapSize...
# There should be no hashCode method in Buffer\*Cell - I removed these for a 
reason. Because we can have a Cell that is a CellName, and vice-versa, using a 
Cell as a key for a map is likely dangerous. Since we don't do it anywhere, 
it's safe to simply remove the methods.

There may be other minor issues, I'll hold off giving it a formal review until 
we decide the direction we're going. To respond to a few of your comments:

bq. CounterUpdateCell interface is missing as well as NativeCounterUpdateCell 
implementation to match it.

There shouldn't be one for the time being - we can never construct one.

bq. CounterUpdateCell should be BufferCounterUpdateCell as it extends BufferCell

Same reason - it doesn't exist as either or, so I made a conscious decision to 
leave it as a CounterUpdateCell: the fact that it extends BufferCell is kind of 
unimportant. It's purpose is somewhat different, and I think it is better left 
named CounterUpdateCell, as that is its purpose (to carry a counter update as 
far as the memtable, and no further).

bq. Impl classes extends another Impl classes which doesn't make much sense as 
all of the methods are static.

This brings in the namespace of the extended class' static methods, which is 
useful.

bq. When taken out of context like that it doesn't really make sense but what I 
meant, there are situation where we don't really need to get BB from the 
CellName but can transfer bytes directly (especially for the native cell 
implementations).

Sure, but again: scope of ticket, and care needs to be taken when doing this 
(e.g. your updateDigest modifications)

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972776#comment-13972776
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


To address all of your comments this is not intended for any kind of review 
yet, it is just an idea demonstration that's why I basically carried over all 
of the methods from original implementations, didn't rename or move stuff. Also 
I'm fine if methods in both implementations are going to return constant values 
like serializationFlags or isMarkedForDeleted, a part from that there is not 
much of the code duplication, duplication is also going to be minimized when 
hashCode and other methods go away, which would probably only leave us with 
dataSize and serializedSize duplication but I guess we can come up with 
something clever for native cells there too. Regarding the point about 
updateDigest - it's meant more like representation of kind of things we can do 
if we have two different implementations of it, not optimized for performance 
yet.

bq. There shouldn't be one for the time being - we can never construct one.

and 

bq. Same reason - it doesn't exist as either or, so I made a conscious decision 
to leave it as a CounterUpdateCell: the fact that it extends BufferCell is kind 
of unimportant. It's purpose is somewhat different, and I think it is better 
left named CounterUpdateCell, as that is its purpose (to carry a counter update 
as far as the memtable, and no further).

It is constructed in ColumnFamily and ColumnSerializer. If it's supposed to be 
only one implementation for now let's name it appropriately and use like all 
other buffered cells.

bq. This brings in the namespace of the extended class' static methods, which 
is useful.

By why do we care and what does it give us as those interfaces are called 
directly and static methods don't override each other?

bq. Sure, but again: scope of ticket, and care needs to be taken when doing 
this (e.g. your updateDigest modifications)

I don't really follow what are you implying with that, the scope is introduce 
native implementations as optimized as possible so why do we miss out of such 
low hanging fruit?...



 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972796#comment-13972796
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. the scope is introduce native implementations -as optimized as possible-

Otherwise we need to do a lot more than the changes you are suggesting :)

bq.  Also I'm fine if methods in both implementations are going to return 
constant values like serializationFlags or isMarkedForDeleted

Well, these are still duplication - it is not clear as a result where the 
definition of these behaviours live. If the semantics change in future, it may 
introduce errors unnecessarily. Either way equals(),  reconcile() and 
validateFields() will still be issues. You don't seem to have implemented most 
of these methods yet (looks like your code doesn't actually compile). These 
methods are each non-trivial amounts of code duplication, equals() especially 
so is we optimise it as you want to. CounterCell.diff() will also need to be 
duplicated.

But, like I said, I can probably live with all of this if we address the 
setPeer() issue. equals() should probably still end up in a shared static 
method, at the very least, though.


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972832#comment-13972832
 ] 

Marcus Eriksson commented on CASSANDRA-6694:


I'm +1 on [~benedict]s branch (have not looked at the one by [~xedin] yet)

nits;
* A few methods in Cell.Impl look redundant, isMarkedForDelete/isLive for 
example, kept around for symmetry?
* License header in DeletedCell and ExpiringCell
* Javadoc comment in NativeAllocator looks wrong

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973220#comment-13973220
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


bq. Well, these are still duplication - it is not clear as a result where the 
definition of these behaviours live. If the semantics change in future, it may 
introduce errors unnecessarily. Either way equals(), reconcile() and 
validateFields() will still be issues. You don't seem to have implemented most 
of these methods yet (looks like your code doesn't actually compile). These 
methods are each non-trivial amounts of code duplication, equals() especially 
so is we optimise it as you want to. CounterCell.diff() will also need to be 
duplicated.

Most of the duplicated methods are methods with static behavior which is not 
going to change e.g. isMarkedForDelete, getMarkedForDeleteAt or 
serializationFlags. CounterCell.diff and reconcile are living in the interface 
for now. I will address setPeer(long) problem and hashCode.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973226#comment-13973226
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. CounterCell.diff and reconcile are living in the interface for now

Ah. This is a Java 8 only feature, which is why I missed it. Not really 
feasible.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973297#comment-13973297
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


I'm not talking about default methods in interfaces, I'm just saying that I 
added static diff/reconcile to CounterCell for now :)

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973395#comment-13973395
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

bq. It's purpose is somewhat different, and I think it is better left named 
CounterUpdateCell, as that is its purpose (to carry a counter update as far as 
the memtable, and no further).

FWIW it doesn't even make it to a memtable in 2.1, ever. That said, not calling 
it BufferCounterUpdateCell would be bothering my consistency OCD, a lot, and 
I'm not done with counters until 3.0. Can you make my OCD a tiny favor and call 
it consistently with the other implementations? (: Thanks.

bq. There should be no hashCode method in Buffer*Cell - I removed these for a 
reason. Because we can have a Cell that is a CellName, and vice-versa, using a 
Cell as a key for a map is likely dangerous. Since we don't do it anywhere, 
it's safe to simply remove the methods.

Maybe we should just throw UnsupportedOperationException then, but leave the 
methods? I agree that using Cell-s as keys is very unlikely, but stuff like 
this has bitten us before.

Haven't read either branch yet, but planning to soon, just wanted to jump at 
the opportunity to bikeshed a bit.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973405#comment-13973405
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. Can you make my OCD a tiny favor and call it consistently with the other 
implementations? (: Thanks.

Sure. I have a preference to keep it that way, but not a strong one.

bq. Maybe we should just throw UnsupportedOperationException then, but leave 
the methods? I agree that using Cell-s as keys is very unlikely, but stuff like 
this has bitten us before.

Also sure.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973435#comment-13973435
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Regarding, the hashCode that's what we do, I do it in AbstractCell now, 
Benedict does it in both BufferCell and NativeCell.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973534#comment-13973534
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Ok, hashCode and setPeer changes are now pushed to the same branch, 
AbstractNativeCell is independent of NativeAllocation now because 
NativeAllocator returns aligned peer directly, which allows peer field to be 
made final in AbstractNativeCell. Also I have pushed set/get logic for data 
size associated with the pointer to the NativeAllocator as it's basically it's 
metadata, IMO it's a bit cleaner comparing to how that is done in Benedict's 
branch where NativeAllocation tracks pointer alignment to size (internalPeer() 
{ return peer + 4; }) but NativeAllocator takes care of allocating 4 additional 
bytes to requested size.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973538#comment-13973538
 ] 

Benedict commented on CASSANDRA-6694:
-

I don't think this is the right approach: with the changes we are making, we 
are pretty much precluding doing anything fancy with GC (we'll have to rely on 
malloc for now). As such the size is no longer providing any useful book 
keeping information to the NativeAllocator. It should be dealt with entirely in 
the AbstractNativeCell - its concept of size is entirely unique to it for now. 
This also, separately, makes packing structs of NativeCell a lot more straight 
forward.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973571#comment-13973571
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


I just don't like that in NativeAllocation we assume that NativeAllocator has 
reserved 4 bytes for us. So I decided to put everything into NativeAllocator 
and only return useful space so we don't have to + 4 every time we need a peer. 
It could be done in AbstractNativeCell which would allocate size + 4 or it 
could be done in NativeAllocator and it would tell how big allocation was based 
on the area pointer that it returned (which is was 
NativeAllocator.getDataSize(areaPointer) does) on demand, either of those 
places (AbstractNativeCell or NativeAllocator) works for me.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973582#comment-13973582
 ] 

Benedict commented on CASSANDRA-6694:
-

The only reason it was happening in NativeAllocator was to support moving the 
peer around (so you need to know how much memory you're copying). 

NativeAllocation assuming it has (i.e. _being defined as having_) a size prefix 
is fine when it is tightly coupled with NativeAllocator (like it is in my 
branch) - but once you have it as a final field in another object, 
NativeAllocator should simply have no say in the matter. It never needs to know 
the size of the allocation, so we should just redefine what our 
AbstractNativeCell considers to be its size in its sizeOf() calculation.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973595#comment-13973595
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Sure, if you like that better I will change that right away, anyhow if we need 
it in allocator for some reason we can change it.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973618#comment-13973618
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Done, I have force pushed to my branch, now AbstractNativeCell is handling 
size, NativeAllocator has nothing to do with it.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973633#comment-13973633
 ] 

Benedict commented on CASSANDRA-6694:
-

Thanks. Although it looks like you haven't updated any of the offsets to work 
with the new layout?

As to the other changes you've made: I do not like the pollution of 
PoolAllocator with supportsNative(). Since this branch is supposed to be 
pushing idiomatic Java usage, let's stick to using interfaces for 
specialisation since we can.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973645#comment-13973645
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Why it does - internalPeer does + 4 and internalSize does - 4 when all get/set 
methods use internalPeer() + offset. Regarding (and I was waiting for that) 
supportsNative() and allocateNative - I did that because I don't want to put 
time into adding DataAllocator and DataPool interfaces that your code has just 
yet, once it's decided which way we want to go I will remove allocateNative and 
do proper work there. This still intended as just an idea presentation for how 
to handle Cell without Impl classes.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973647#comment-13973647
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. This still intended as just an idea presentation for how to handle Cell 
without Impl classes.

OK, cool. Glad we're staying on topic :)

bq. Why it does - internalPeer does + 4 and internalSize does - 4

My mistake. I was expecting to see the static OFFSET fields updated - we should 
probably optimise that before we finish up (now that we can), but obviously 
fine for now.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-16 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970487#comment-13970487
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Also it seems like for some of the methods e.g. updateDigest, delta, dataSize, 
diff, reconcile, hashCode etc. it would be much better to have native 
implementations which work with underlying bytes directly from day one. Some of 
them, for example, use value().remaining(), value().compareTo(), 
value().duplicate(), or name.toByteBuffer() convert data from one 
representation to another for no real reason, so we can actually end up 
generating a lot more temporary objects then we anticipate. There is another 
concern related to value() method which converts pointer to DirectBuffer, the 
problem is that (at least in OpenJDK and I think Oracle done the same) 
initialization of that class is synchronized and creates PhantomReference, 
which with most collectors only be purged by Full GC.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-16 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970603#comment-13970603
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. for now we are allocating BufferCounterCell which allows as to use 
CounterCell.Impl.reconcile for both implementations

We only allocate a new object in the case that the reconcile result isn't one 
or the other of the original inputs. This object is only incredibly short 
lived, and we decided it was easier than passing through the allocator for 
reconcile. This may be slightly worse than we'd like as a result of the 
different cellname layouts, but that can be smoothed over with time. It's 
cleaner in the Memtable +ABC code to keep the pool allocation separate from 
this reconcile, and it's small fry compared to the other stuff we're doing on 
write of counters.

bq. it would be much better to have native implementations which work with 
underlying bytes directly from day one

Agreed that some of these would be nice, however we rarely (if ever) call 
these, and as per your comments wrt zero-copy (CASSANDRA-6842), if we aren't 
worried about copying the contents, we shouldn't be worried about allocating 
temporary objects. Now, there are some methods I would say would be nice to 
have native implementations of sooner than later (e.g. updateDigest), but I 
don't think they're by any means _essential_. What is going to be *far* more 
impactful is CASSANDRA-6755, as this has a reasonably large negative impact on 
name lookups (and to a lesser degree slicing) from a memtable record.

That said, some of them would be quite easy to implement. So I'm not totally 
opposed to delivering them from day 1, I just wanted this patch set to be 
clearly readable and well contained. It's pretty big as it is. I think it would 
be nice to put any of these optimisations in a second ticket. 

If you're suggesting we drop the Impl hierarchy entirely from this patchset and 
just duplicate the methods and optimise, I can maybe get behind that. However 
optimising reconcile() and equals() gets ugly quickly if you want to be able to 
deal with either side of the equation being one or the other (often we'll 
reconcile different kinds, but not always). So we still need a shared 
implementation, but one that is capable of detecting the kind of Cell on each 
side, and selects the correct version of the method. Leaving very few methods 
we'll be optimising in Native*Cell, so most of the code will be duplicated 
unnecessarily if we take that route. But I can live with that if the reviewers 
can all live with the increased size of the patchset.

bq. name.toByteBuffer() convert data from one representation to another for no 
real reason

Why do you say no real reason? This is the serialization format, so we have 
to convert to it. That's the definition of what toByteBuffer() should return. 
We only call it when writing to disk or to the network, and is no different 
from the original implementation in that regard. That's not to say with time we 
cannot change this, but there's not much we can do yet.

bq. There is another concern related to ... DirectBuffer ... initialization of 
that class is synchronized and creates PhantomReference

I construct it using unsafe, which skips all constructors. So there is no 
synchronization or PhantomReference creation.


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they 

[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-16 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972205#comment-13972205
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


So here is the 
[branch|https://github.com/xedin/cassandra/compare/CASSANDRA-6694] which 
implements my idea of how to get rid of the Impl classes for Cell (+ does 
optimized updateDigest for both Cell implementations and couple of other 
things), I left DecoratedKey alone for now, work not fully complete yet but 
only couple on nit things are missing - I need to change couple of places to 
use CFMetaData and clone native cells so I decided not to do it if we are not 
going to go with that code.

Regarding [~benedict]'s reorg branch I found couple of problems:

# internalGetLong(long, long) is actually meant to be internalSetLong(long, 
long) in AbstractMemory;
# CounterUpdateCell should be BufferCounterUpdateCell as it extends BufferCell
# CounterUpdateCell interface is missing as well as NativeCounterUpdateCell 
implementation to match it.

bq. Why do you say no real reason? This is the serialization format, so we 
have to convert to it. That's the definition of what toByteBuffer() should 
return. We only call it when writing to disk or to the network, and is no 
different from the original implementation in that regard. That's not to say 
with time we cannot change this, but there's not much we can do yet.

When taken out of context like that it doesn't really make sense but what I 
meant, there are situation where we don't really need to get BB from the 
CellName but can transfer bytes directly (especially for the native cell 
implementations). 

bq. I construct it using unsafe, which skips all constructors. So there is no 
synchronization or PhantomReference creation.

Right, we should be good there, my bad.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-15 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970462#comment-13970462
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


[~benedict] While working on trying to avoid usage of Impl classes and looking 
closer at the code I have a question, which knowing that future is going to be 
totally off-heap makes sense to ask now: current Native*Cell classes re-use 
Impl code from static implementations of interfaces but some of the methods 
e.g. reconcile for Counter(Update)Cell in certain conditions need to generate a 
new object (for now we are allocating BufferCounterCell which allows as to use 
CounterCell.Impl.reconcile for both implementations), do you have an action 
plan regarding required changes in that regard for the next step in this series 
when we are not going to copy things back to heap? 

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965066#comment-13965066
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


[~jbellis] I will leave this a alone if you and others are fine with maintaing 
the code as it is in the patch set. Discussion I'm trying to have, and I 
presume others are interested too, centered around the question - if there is a 
better (cleaner if you will) way to organize Cell to avoid unnecessary field 
allocation as well as keeping us from introduction of static Impl classes with 
only static methods inside that extend each other, I still don't understand why 
we would extend one class, that has only static methods, from another with the 
same method layout (e.g. DeletedCell.Impl extends Cell.Impl) which results in 
bigger constants pool per class and has byte code implications that I have 
previously described. From my point of view, it looks like we are basically 
trying to re-build inside of Cassandra what JVM already provides as a platform.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965465#comment-13965465
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. why we can't have a simple implementation of the cell which has one buffer 
+ metadata about component sizes (which could also be encoded) instead of 
having buffer per component in the name (if composite) + buffer for value + 
long timestamp

I think this is the key question so I want to back out of the Imple rabbit hole 
for a minute to address that.  This would absolutely simplify things a great 
deal in terms of the Allocator design.  The problem is that it has a much 
bigger impact on the rest of the code, and the consensus from the last ticket 
was, We want to have off-heap as an option, but we want the default to stay 
on-heap and change as little as possible.  So, I agree that what you are 
saying is cleaner but I think we should push it out to 3.0 given the 
constraints for 2.1.


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965481#comment-13965481
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. Can we decide if we actually want to have Cell (and derivatives) as this 
patch set proposes (with static Impl static classes which is OOP unfriendly to 
say the least) or do something else (question raised back in CASSANDRA-6689)?

If we accept the NativeCell/BufferCell distinction above, then the combination 
of optimization and lack of multiple inheritance drives this design or 
something like it.  Specifically, we want NativeCell to be both a Cell and a 
NativeAllocation, so Benedict has (reasonably, IMO) chosen to extend NA and 
leave the Cell common methods in a utility Impl class.  (IMO the right OOP 
approach would be to extend Cell, making it an Abstract class instead of an 
Interface, and have NativeCell have a NA as a field instead of extending it.  
But then we're increasing the memory overhead of a NC by almost 50% which 
directly impacts our main goal here.)

I can see reasonable alternatives to where exactly the static utility methods 
live: put them in the BufferCell classes and have the Native classes reuse them 
that way, or put them in a separate class entirely, and I'm okay with either of 
those options but I don't really see them as strictly better than the Impl 
choice (which has the advantage of encapsulating what interface specifically 
they deal with, distinct from the Buffer or Native subclasses).


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965487#comment-13965487
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. Is it essential to move everything to the separate package .data ?

If I may bikeshed a bit, data is a fairly meaningless term in the Cassandra 
context and I would prefer to name it cells instead.  Otherwise, I think it's 
a reasonable refactor.

My initial reaction was, moving things to different packages should totally be 
a separate commit but the new interfaces don't share a whole lot with the old 
classes other than the name.  So even that doesn't really bother me, but if 
Pavel or Marcus still want that to facilitate review then it's a reasonable 
request.


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965493#comment-13965493
 ] 

Benedict commented on CASSANDRA-6694:
-

I agree data is a bit meaningless - and, in fact, I started with cells. But 
it includes DecoratedKey / RowPosition, so data became the easiest most 
encompassing term. More than open to better suggestions.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965502#comment-13965502
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

Simple solution: leave DK and RP where they are. :)

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965562#comment-13965562
 ] 

Benedict commented on CASSANDRA-6694:
-

Well, the only fly in that ointment is that they have Buffer and Native 
implementations also, and the DataAllocator allocates them as well as cells. 
So to separate them seems a bit strange - but I'm not too fussed tbh.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963907#comment-13963907
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


bq. I'm saying performance critical code is impacted when you have virtual 
method calls that cannot be optimised by the VM (i.e. those with multiple 
implementations). I meant CASSANDRA-6553 and CASSANDRA-6934

Which means that if we actually optimize AbstractType and derivatives to work 
directly with underlying bytes whole problem could be resolved? That's why I 
want to understand why we can't have a simple implementation of the cell which 
has one buffer + metadata about component sizes (which could also be encoded) 
instead of having buffer per component in the name (if composite) + buffer for 
value + long timestamp? Maybe it would be easier to offload all of the work to 
AbstractType instead of trying to optimize on the Cell level?

I went through JVM instruction set doc (specifically 
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.invokespecial
 and 
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.invokestatic)
 those methods are not that different and both have to do lookup in the 
constant_pool of that class so I'm wondering if it's virtual calls that create 
a problem or it's something else masked by that...

It also looks like if we use static Impl scheme (like in the patch set) would 
execute the same amount of instructions because compiler emits *aload_0* (this) 
in both cases before would it be invoke\{special, virtual\} or invokestatic, 
and more instructions in static Impl form if we use something else instead of 
this. Generally when callers use methods from super class or interface (as it 
is right now for e.g. Cell.dataSize()) compiler would emit *aload_0, 
invokevirtual #offset* directly to the Cell method, where with static Impl it 
has to that multiple times *aload_0, invokevirtual #offset* (to the method in 
DeleteCell.dataSize() and then internally *aload_0, invokestatic #offset* (to 
the DeletedCell.Impl.dataSize()) which means longer constant_pool walk.


bq. Then what exactly do we win? We still have to have two hierarchies and the 
same modularization. Also the potential ease of optimizations for comparison 
disappear, and we still have increased indirection and virtual method call 
costs. If this is the suggestion, I am very -1, as the payoff is very small, 
the work nontrivial and the negatives substantial.

The wins are, primarily, less object overhead (ultimate goal of all this) and 
maintainability of the code. We basically have Cell based on type - expired, 
deleted, counter, client (the last one being used mostly by Thrift) as it is 
right now, so no Buffered* or Native* plus allocators of 3 types (maybe we 
actually don't need one which allocates DirectBuffer but can just go with JNA 
backed one) which allocate raw bytes. Cell reconcile, equals, dataSize and 
other methods become straight-forward. Also, as we consider Composite as a 
complete entity, storing components as contiguous blocks would reduce container 
overhead + speeds up comparisons by exploiting spatial locality. 

[~jbellis] mentioned this My preferred solution would be, stop extracting the 
name so often by itself. Spot checking the code, it seems we usually do this 
just to simplify a comparison, so this could in principle just be done with 
the Cell object rather than just the name. I think that would would further 
benefit the approach that I'm describing.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object 

[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-09 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963923#comment-13963923
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. less object overhead

There is no reduced overhead from the current patch.

bq. Also, as we consider Composite as a complete entity, storing components as 
contiguous blocks would reduce container overhead + speeds up comparisons by 
exploiting spatial locality

You seem to be backtracking to the prior suggestion of only one implementation. 
I am potentially ok with this, but see my prior comment for concerns and 
complications. The -1 was to having what we have now except with an extra level 
of indirection (i.e. one packed Cell implementation, and one componentised like 
we had before this patch). Also, I would prefer to avoid the extra indirection 
+virtual method costs of having another inner object representation, within 
which we need another offset.

The JVM instruction set is besides the point. The point is what hotspot will 
do: with a single implementor or static method of small enough bytecode 
representation, it will be inlined. Note I said multiple implementation 
virtual method. With the option you suggest we will need an extra virtual 
invocation cost with every access to the underlying bytes, some extra math to 
access the right location, and one extra object field reference to locate the 
position we're offsetting from. These costs mount up rapidly.

Hmm. No, I now note your client implementation: what exactly is this one? 
Please clarify, as the thrift cell is going to need to be compared with the 
other implementations, and suddenly much of any benefit will disappear. The 
best way to make comparisons cheap and easy is to have both sides of the 
comparison have at least the same layout. If we have to either virtual invoke 
or instanceof check for every comparison, and a different code path for 
comparing each type of representation, there will be a performance impact. As 
such the only main benefit of this approach is eliminated in my eyes. Also, how 
will this client implementation achieve its various functions, and define its 
type? Seems like you'll need a duplicate hierarchy still.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13964564#comment-13964564
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


bq. The JVM instruction set is besides the point. The point is what hotspot 
will do: with a single implementor or static method of small enough bytecode 
representation, it will be inlined. Note I said multiple implementation 
virtual method. With the option you suggest we will need an extra virtual 
invocation cost with every access to the underlying bytes, some extra math to 
access the right location, and one extra object field reference to locate the 
position we're offsetting from. These costs mount up rapidly.

How is that besides the point when you claim that method calls with multiple 
implementations are slower than (and not getting inlined) static method 
invocations from multiple classes basically constant_pool reimplementation in 
your code?... What I claim is that it doesn't matter if you override a method 
multiple times or call a static method which calls another static method like 
your patch does for DeletedCell e.g. \{Native, 
Buffer\}DeletedCell.cellDataSize() which calls 
DeletedCell.Impl.cellDataSize(this) which transfers to 
Cell.Impl.cellDataSize(this); Just make an example disassemble classes (with 
javap -c or similar) and see what bytecode did it generate. Also for inlining 
problem I would like to see the proof of reason why are those methods are not 
getting inlined (are they even touched by JIT?) by enabling logging with 
-XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+PrintInlining and 
sharing the output, otherwise multiple implementation virtual method being 
slow claim is just empty rhetoric.

bq. Hmm. No, I now note your client implementation: what exactly is this one? 
Please clarify, as the thrift cell is going to need to be compared with the 
other implementations, and suddenly much of any benefit will disappear. The 
best way to make comparisons cheap and easy is to have both sides of the 
comparison have at least the same layout. If we have to either virtual invoke 
or instanceof check for every comparison, and a different code path for 
comparing each type of representation, there will be a performance impact. As 
such the only main benefit of this approach is eliminated in my eyes. Also, how 
will this client implementation achieve its various functions, and define its 
type? Seems like you'll need a duplicate hierarchy still.

What was just a suggestion for temp container in between client transport and 
memtable, as those buffers are already allocated separately by thrift it seems 
reasonable to have Cell work with those buffers, it would take more memory for 
ByteBuffer containers passed from Thrift but cell comparison logic should not 
change because as they would operate on the common container type, it's similar 
contept to what Netty does with ByteBuf gathered from other ByteBuf pieces.




 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962658#comment-13962658
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. Can we decide if we actually want to have Cell (and derivatives) as this 
patch set proposes (with static Impl static classes which is OOP unfriendly to 
say the least) or do something else (question raised back in CASSANDRA-6689)?

Can we have something more concrete than something else as a suggestion?

bq. Is it essential to move everything to the separate package .data ?

No refactoring is essential - however it is much cleaner given all of the new 
classes.

bq. Maybe there is a way which allows us to still have key/value/timestamp as 
fields, so we should only change callers method/class signatures instead? In 
general the idea would be to keep a single implementation of the Cell and add a 
generic placeholder instead of ByteBuffer.

This seems to miss the entire purpose of this patch, which is to reduce the 
heap consumption of each Cell. If we use another placeholder, we will no doubt 
only *increase* the memory consumption, not decrease it; or, at best, reduce it 
only fractionally for off-heap and increase it for on-heap implementations. 
Neither are really acceptable, and would make this whole patch a bit worthless.


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962679#comment-13962679
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


If I had something more concrete you would see a patch for it, but here I am 
trying to start a discussion, I think [~jbellis] mentioned that it might be 
better to reduce usage of the column names instead of merging cell with column 
name (if I remember correctly). Regarding the moving stuff around - if it's not 
essential then we can do it at the very last stage once we done with all more 
important changes which are plenty. Regarding placeholders idea, if we allocate 
contiguous region for the whole cell we can just have memory object + 1 int (or 
was it even short?...) field which marks the end of the column name at that 
buffer, as column timestamp is a fixed size long we know exactly where column 
value ends, that also helps with spatial locality in most of the cases.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962691#comment-13962691
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. I think Jonathan Ellis mentioned that it might be better to reduce usage of 
the column names instead of merging cell with column name (if I remember 
correctly)

I don't recall this suggestion. Perhaps you are referring to the suggestion 
that we not extract the cell names from the cell as often as we do, for the 
purpose of comparison, in order to reduce garbage production?

bq. Regarding placeholders idea, if we allocate contiguous region for the whole 
cell we can just have memory object + 1 int (or was it even short?...) field 
which marks the end of the column name at that buffer, as column timestamp is a 
fixed size long we know exactly where column value ends, that also helps with 
spatial locality in most of the 

In this case, this suggestion has much more complex problems:

# More (multiple implementation) virtual method invocations (as shown by 
CASSANDRA-6993 this can have meaningfully negative performance implications)
# Major refactor of AbstractType hierarchy to prevent bytebuffer allocation on 
comparison
# More object allocation in the request threads due to having to re-pack all of 
any parameters into a Cell with a single buffer, as opposed to just dropping 
them in place
# At which point it would make most sense to refactor (and mostly eliminate) 
the entirety of CASSANDRA-5417, as we're almost always pumping the result 
straight into a Cell anyway, so extracting the components into separate buffers 
and repacking them into a single buffer in the Cell is very wasteful

That said, it is *viable*. It has some advantages too: the comparisons between 
Native and Buffer cells are much more easily optimised. Many of these changes 
may well need to happen in the natural course of things anyway as we optimise 
the native implementation. But it has comparatively wide-ranging implications 
for the current on-heap use case that might be a bit too much to bite off right 
now.

bq. if it's not essential then we can do it at the very last stage once we done 
with all more important changes which are plenty

I disagree. It makes the patch more complicated to *not* move it around. 
Because something is not essential does not mean it is not the better option

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963063#comment-13963063
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. It makes the patch more complicated to not move it around.

I thought Pavel was referring to moving existing classes into new packages, but 
I may be mistaken.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963100#comment-13963100
 ] 

Benedict commented on CASSANDRA-6694:
-

Yes, but I've gutted those classes into interfaces and introduced new sister 
classes. And modelling those (in my head, at least) is very difficult when it's 
not easy to see what classes relate to each other at a glance, and the db 
package is overpopulated as it is. So anything I need to do to make it possible 
to think about whilst writing it, I assume is going to be helpful for anybody 
else reading it.

But if you both want to change that bit, I can rebase again. Since I've written 
it, it doesn't matter so much to me now.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963531#comment-13963531
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


bq. More (multiple implementation) virtual method invocations (as shown by 
CASSANDRA-6993 this can have meaningfully negative performance implications)

I'm getting mixed signals here, are you claiming that JVM does a bad job or OOP 
is broken in general? Also CASSANDRA-6993 seems to point to a different problem.

bq. Major refactor of AbstractType hierarchy to prevent bytebuffer allocation 
on comparison

I don't see a problem with this if it spares most of the changes in allocators 
and Cell*/DecoratedKey rewrites.

bq. More object allocation in the request threads due to having to re-pack all 
of any parameters into a Cell with a single buffer, as opposed to just dropping 
them in place

We can have a Cell separate implementation with multiple buffers as Thrift 
allocates them anyway which we are going to be transformed to linear ones once 
they get into memtable as we have to reallocate there.

bq. At which point it would make most sense to refactor (and mostly eliminate) 
the entirety of CASSANDRA-5417, as we're almost always pumping the result 
straight into a Cell anyway, so extracting the components into separate buffers 
and repacking them into a single buffer in the Cell is very wasteful

Is it good or bad to refactor it? [~slebresne]/[~jbellis] WDYT?



 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963619#comment-13963619
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. I'm getting mixed signals here, are you claiming that JVM does a bad job or 
OOP is broken in general? Also CASSANDRA-6993 seems to point to a different 
problem.

I'm saying performance critical code is impacted when you have virtual method 
calls that cannot be optimised by the VM (i.e. those with multiple 
implementations). I meant CASSANDRA-6553 and CASSANDRA-6934

bq. We can have a Cell separate implementation with multiple buffers as Thrift 
allocates them anyway which we are going to be transformed to linear ones once 
they get into memtable as we have to reallocate there.

Then what exactly do we win? We still have to have two hierarchies and the same 
modularisation. Also the potential ease of optimisations for comparison 
disappear, and we still have increased indirection and virtual method call 
costs. If this is the suggestion, I am very -1, as the payoff is very small, 
the work nontrivial and the negatives substantial.



 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-07 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962580#comment-13962580
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


I looked through the code and here is the couple of questions that I have/had:

# Can we decide if we actually want to have Cell (and derivatives) as this 
patch set proposes (with static Impl static classes which is OOP unfriendly to 
say the least) or do something else (question raised back in CASSANDRA-6689)?
# Is it essential to move everything to the separate package .data ?
# Do we want to have a number of pool/allocator implementations which allocate 
different type of objects or is it possible to make a generic container (for 
ByteBuffer/Memory) which would basically be a pointer to a bigger buffer that 
holds all of the components (name/value/timestamp) so we can have limited 
number of allocators/pools to maintain ([~jbellis] described the same vision in 
one of his previous comments)... Maybe there is a way which allows us to still 
have key/value/timestamp as fields, so we should only change callers 
method/class signatures instead? In general the idea would be to keep a single 
implementation of the Cell and add a generic placeholder instead of ByteBuffer.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-04 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13959841#comment-13959841
 ] 

Benedict commented on CASSANDRA-6694:
-

rebased and pushed -f

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-04 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960112#comment-13960112
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

Is there a case to be made here that there's more abstraction than necessary?  
Because I'm still having trouble wrapping my head around it.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-04 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960117#comment-13960117
 ] 

Benedict commented on CASSANDRA-6694:
-

Well, it's probably indicative of something wrong, but I don't think it's the 
level of abstraction. Probably I can re-organise it to make it clearer, though.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-04 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960187#comment-13960187
 ] 

Benedict commented on CASSANDRA-6694:
-

Rebased, reorganised and pushed to 
[6694-reorg|https://github.com/belliottsmith/cassandra/tree/6694-reorg]

Does that make it clearer what's going on?

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-04 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960192#comment-13960192
 ] 

Benedict commented on CASSANDRA-6694:
-

We basically have:

BBAllocator (and implementors)
BBPool + BBPoolAllocator (and implementors)
NativePool + NativeAllocator

BBPoolAllocator creates a BBAllocator per session, by wrapping the session's 
OpOrder.Group

BBAllocator is used to construct Buffer* implementations (necessary without 
further refactor, as that's how CellName implementors work, and don't want to 
rip those apart in this commit);
DataAllocator wraps the above to create arbitrary implementations (i.e. Native* 
or Buffer*, atm)

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-04 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960864#comment-13960864
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Sorry guys, I've been busy with multiple things this week, will try to take a 
look at this on weekend.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13958124#comment-13958124
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

Can you give an overview of all the classes in memory/ and how they fit 
together?

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-02 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13958166#comment-13958166
 ] 

Benedict commented on CASSANDRA-6694:
-

Sure. There are now four main concepts that interact in various ways:

ByteBufferAllocator, Pool  PoolAllocator,  ByteBufferPool.Allocator (and its 
implementors) and NativeAllocator

# ByteBufferAllocator, as the name suggests, is a straightforward abstraction 
for the allocation/cloning of NIO ByteBuffers. It does not directly support any 
concept of pooling, nor understand the use of OpOrder for write guarding.
# Pool and PoolAllocator are now independent of any concept of *what* they 
allocate - they simply manage the idea of the memory resources, and leave the 
actual allocation to the implementing class
# ByteBufferPool.Allocator is the combination of PoolAllocator and BBA, 
although it itself isn't a BBA - it constructs a context BBA when given a 
writeOp that is guarding the allocation. This helps to keep the concept of 
write guarded pooled allocations cleanly separated from simple BBA allocations, 
whilst using the same code paths.  Note that BBP.A is _abstract_ and is 
implemented by SlabAllocator and HeapPool.Allocator. We might consider renaming 
SlabAllocator to HeapSlabAllocator to keep naming consistent and help clarity.
# NativeAllocator is, by contrast, the extension of PoolAllocator that supports 
native allocations - that is, any object that extends NativeAllocation.



 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-25 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13946848#comment-13946848
 ] 

Benedict commented on CASSANDRA-6694:
-

Pushed a rebased branch against latest CASSANDRA-6689

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947041#comment-13947041
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

So our classic Cell is ByteBuffer name + ByteBuffer value + long timestamp.

After 6689 the buffers can be off-heap but just having the DirectBuffer objects 
on heap is still a lot.  So we want to introduce NativeCell which is basically 
just a pointer to off-heap memory containing all 3.

Why do we need all the Pool and Allocator changes for that?  Can't we allocate 
NativeCell on the old SlabAllocator?

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-25 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947318#comment-13947318
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. Why do we need all the Pool and Allocator changes for that? Can't we 
allocate NativeCell on the old SlabAllocator?

I want to avoid the ugly situation where one allocator can make two kinds of 
allocation, another can only make one, and it's not clear how or why, or which 
kind to allocate. I've separated the types of allocators, is all, and 
introduced an extra abstraction, DataAllocator, which encapsulates how one of 
the other type of allocator can be used to construct cells/keys, and also how 
to clean them up.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944885#comment-13944885
 ] 

Sylvain Lebresne commented on CASSANDRA-6694:
-

bq. I see a 25% throughput improvement using offheap_objects as the allocator 
type vs either on/off heap buffers.

That's definitively good to know, but does that suggest that without this, 
there isn't much notable performance difference between on and off heap 
buffers? Because if that's the case, I'm still of the opinion that it could be 
worth moving this to 3.0 on the argument that we've moved stuff in 2.1 last 
minute enough as it is. 

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944889#comment-13944889
 ] 

Benedict commented on CASSANDRA-6694:
-

This specific patch only permits larger amounts of data to be retained in 
memtables; the only speed-wise performance implications of this are the ones 
stated here, i.e. improved write throughput through reduced write amplification 
and the writing of larger files.

For this workload there's basically no difference between on and off-heap 
(CASSANDRA-6689) ByteBuffer backed storage,  if that's what you're asking, 
because the on-heap overhead still heavily outweighs the off-heap utilisation. 
This would not be true for workloads with large per-column payloads.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-23 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944357#comment-13944357
 ] 

Benedict commented on CASSANDRA-6694:
-

FTR, with a very simple and short performance comparison, simulating writing a 
lot of small (integer) fields, using cassandra-stress write n=40 -col 
size=fixed\(4\) n=fixed\(100\), I see a 25% throughput improvement using 
offheap_objects as the allocator type vs either on/off heap buffers.

I should expect to see performance improve further as the length of the test 
increases, as write amplification takes its toll more rapidly on the heap 
buffers.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944212#comment-13944212
 ] 

Benedict commented on CASSANDRA-6694:
-

Patch available 
[here|https://github.com/belliottsmith/cassandra/tree/6694-moreoffheap]

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922529#comment-13922529
 ] 

Benedict commented on CASSANDRA-6694:
-

Quick update: I realised I had accidentally included a partial image of the 
changes I was making for CASSANDRA-6781 in the offheap2c I uploaded. I've fixed 
the repository by rolling FastByteComparisons back, since that shouldn't have 
been included in this ticket.

I've also uploaded another 
[tree|https://github.com/belliottsmith/cassandra/tree/offheap2c+6781] which 
includes 6781 for anyone who wants to performance test. I'm in the process of 
looking at this now.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-03 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917897#comment-13917897
 ] 

Benedict commented on CASSANDRA-6694:
-

Pushed another update, with a slight modification that improves the behaviour 
when deleting stale entries in KeysSearcher and CompositesSearcher 

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13916722#comment-13916722
 ] 

Benedict commented on CASSANDRA-6694:
-

Uploaded [offheap2c|https://github.com/belliottsmith/cassandra/tree/offheap2c] 
which contains [~slebresne]'s changes for ByteBuffer and underscore removal 
(since this is actually all in code only in this patch it hopefully shouldn't 
be too much trouble for future merges)

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909343#comment-13909343
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. My preferred solution would be, stop extracting the name so often by itself

Splitting this out into a separate ticket: CASSANDRA-6755

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-21 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908198#comment-13908198
 ] 

Benedict commented on CASSANDRA-6694:
-

Pushed an updated branch 
[here|https://github.com/belliottsmith/cassandra/tree/offheap2a]

This is merged with latest changes in 2.1, and patches up a few oversights as 
well as squashes the zillions of commits (I've left the prior branch intact in 
case a reviewer wants to walk the history, particularly across the refactor).

[~mshuler]: For testing this, we're interested in inducing worst case behaviour 
by, e.g., getting to a steady state where the memtables are full (might want to 
jack memtable_cleanup_threshold to near to 1, so we can push it further. Also, 
that makes me realise we should assert it is  1 at startup :-) ), but using a 
heavily clustered schema; say at least 3 clustering columns, all small, e.g. 
ints. Use a small value column as well; maybe a double. Then pack a LOT of CQL 
rows into a single partition. Say 1M+. Then we want to see what performance is 
like querying from this. I expect it to be worse, but the question is how much 
worse, and can we tolerate it? Probably want to run variants of this with a 
mixed workload as well, but I expect any difference will be lost in the noise 
of the current garbage heavy write layer.

Make sure you run this test connecting through native protocol, and for 
comparison test against same version but with memtable_allocator set to 
heap_slab. Make sure you tune memtable thresholds so they both retain the same 
amount of data :-)



 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-21 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908336#comment-13908336
 ] 

Sylvain Lebresne commented on CASSANDRA-6694:
-

I know I'll sound like a killjoy, but quickly skimming over the patch, this is 
a big patch that moves a lot of code, and more importantly to me, seems to add 
a lot of clutter in places that a priori don't seem all that much related 
(everywhere in the CQL3 code typically). I'm not saying that such changes 
aren't necessary, nor really criticizing the approach (I certainly haven't look 
at the patch close enough for that), just remarking that this is a lot of 
changes, and, it seems to me, a non negligible amount of added clutter.

With that in mind, I don't necessary have a problem with this change in theory 
if such change has demonstrable non-negligible positive impact on performance, 
but I do have 2 small worries:
# I do would want to see tests that demonstrable the non-negligible positive 
impact on performance before committing such change, not *just* tests that show 
this doesn't degrade performance. This might be the intent of everyone and I 
might just be stating the obvious, and my apologies for the distraction if 
that's the case, but the comments above seems to mainly mention the check it 
doesn't degrade things part, and so I just want to make sure we're on the same 
page. Are we?
# this is marked for 2.1-beta2. Is it really reasonable to target big (and 
rather intrusive) changes like that for 2.1 at this point? Does that really 
give us the proper time to review such changes and evaluate the gains it give 
use before committing it?


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-21 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908521#comment-13908521
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. The basic idea is that we have made Cell and DecoratedKey both interfaces, 
and we have buffer and native implementations. The native implementations 
squash the implementation of CellName into the same object, so that we can 
avoid any allocation overhead, and so we don't need to allocate a new object 
every time we read the name. 

With you so far.

bq. As a result we have had to go a little anti-OOP; DecoratedKey and *Cell are 
now interfaces, with static implementation modules, the methods of which are 
invoked by each implementation with themselves as the first parameter.

Not sure I follow, I only see static methods sizeOf and construct in NativeCell 
for instance.

bq. without CASSANDRA-6697 we allocate a lot of ByteBuffers temporarily, i.e. 
whenever we read the constituents of the name or the contents of the cell

My preferred solution would be, stop extracting the name so often by itself.  
Spot checking the code, it seems we usually do this just to simplify a 
comparison, so this could in principle just be done with the Cell object rather 
than just the name.


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-21 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908524#comment-13908524
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

NB: While I'm all for simplifying the RefAction impact if possible, I see it as 
similar to the Allocator we got used to passing to addColumn for the benefit of 
counter contexts, so it doesn't really bother me a whole lot [more] in 
principle.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-21 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908537#comment-13908537
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. Not sure I follow, I only see static methods sizeOf and construct in 
NativeCell for instance.

These are in a subclass called Impl within each *Cell interface, so it is 
shared between Buffer*Cell and Native*Cell

bq. My preferred solution would be, stop extracting the name so often by 
itself. Spot checking the code, it seems we usually do this just to simplify 
a comparison, so this could in principle just be done with the Cell object 
rather than just the name.

I'll have a closer look and see how easy this would be. As it happens, I've 
made (but not yet published) some changes to FastByteComparisons that might be 
extensible to make this work without object instantiation. If we eliminated 
allocation for name comparisons we probably would get _most_ of any benefit, so 
this might be workable.


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-21 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908605#comment-13908605
 ] 

Benedict commented on CASSANDRA-6694:
-

So, at the very least we use it on each read when we return the clustering 
columns. We also do this for the value itself; also in KeySearcher and 
CompositeSearcher whilst filtering; and DecoratedKey.key() is called in a _lot_ 
of places. 

This _may_ also make the comparison code a little unpleasant (maybe not, also, 
but we have a few different kinds of comparison to deal with, and we need to 
make sure there's minimal penalty)... but I think this is probably still the 
right way forward. It won't eliminate as much garbage, but it should leave 
garbage at worst linear in number of columns and independent of _size_ of 
columns, and mostly short lived. Which is probably good enough, and it is 
definitely more sane in scope.


 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-18 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13904708#comment-13904708
 ] 

Benedict commented on CASSANDRA-6694:
-

Initial patch for this is available 
[here|https://github.com/belliottsmith/cassandra/tree/offheap2]

The basic idea is that we have made Cell and DecoratedKey both interfaces, and 
we have buffer and native implementations. The native implementations 
squash the implementation of CellName into the same object, so that we can 
avoid any allocation overhead, and so we don't need to allocate a new object 
every time we read the name. As a result we have had to go a little anti-OOP; 
DecoratedKey and *Cell are now interfaces, with static implementation 
modules, the methods of which are invoked by each implementation with 
themselves as the first parameter. This isn't super pretty, but it isn't super 
ugly either. The ugliest thing here is that I flatten the logic from 
db.composites all into NativeCell, but it turns out this is actually really not 
very hard; they behave _mostly_ the same.

I've also quite widely refactored the stuff introduced in CASSANDRA-5549 and 
CASSANDRA-6689: the PoolAllocator in utils.memory now only defines methods for 
managing the memory use of the pool; what it *means* to allocate is now left 
to its descendants to define. We now split them up into two camps: 
ByteBufferPool and NativePool (renamed from OffHeapPool). The formers' 
allocators implement ByteBufferAllocator (formerly AbstractAllocator), whereas 
the NativeAllocator allocates NativeAllocations. With me still? These 
NativeAllocations form the basis for any objects stored off-heap.

Anyway, these PoolAllocators are now utilised by *Data*Allocators in the db 
package tree; these are comparatively simple, and I wanted to keep the guts of 
the memory management in utils.memory. These DataAllocator instances simply 
know how to clone DecoratedKey and Cell instances, and also how to tidy up any 
unused references.

Some notes:
- This (and CASSANDRA-6689) have negative implications for Thrift at the 
moment, as I have to copy any data on-heap in order to return to thrift. 
Unfortunately this can only easily be rectified by modifying thrift so that we 
have method calls we can override in the worker tasks, that are invoked when 
starting and finishing the servicing of a request.
- I've settled for a 24-byte object, as I really needed to keep some extra 
information on-heap. We can definitely tighten this in the future, but I think 
it probably isn't worth doing at this stage.
- As things stand, without CASSANDRA-6697 we allocate a lot of ByteBuffers 
temporarily, i.e. whenever we read the constituents of the name or the contents 
of the cell

It would be good to get some testing resources allocated to the first and last 
points to see if we should be trying to fix it. We should decide if we want 
CASSANDRA-6697 preferably before we go live with a final 2.1 release. What we 
really need to do is run a number of tests against schemas with a lot of 
composite columns, and see what effect there is on garbage collections, and 
latency metrics.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-02-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13904728#comment-13904728
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

I bet [~mshuler] could help with the testing.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)