[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983417#comment-13983417
 ] 

Benedict commented on CASSANDRA-6694:
-

I've pushed a completed branch 
[here|https://github.com/belliottsmith/cassandra/tree/6694-reorg2]

I've completed your flattening of the PoolAllocator and DataAllocator 
hierarchies, implemented DecoratedKey, reintroduced the extra unit tests, fixed 
some bugs in the Cell hierarchy, slightly rejigged the data layout for native 
cells to simplify offset calculation, and fixed both a performance regression 
and the message digest optimisation.

The only thing I haven't done is the refactors I would like to perform before 
we finally commit this, so as to make review easier for others.

Note I'm still running dtests and doing some final vetting, but I wanted to 
post this message now as I reckon this version is most likely ready and this is 
somewhat time critical, and because I want to avoid any duplicated effort in 
getting a final patch together.

I think I've addressed your concerns, [~iamaleksey], with the following 
notes:

bq. getAllocator() doesn’t belong to SecondaryIndex, API-wise. CFS#logFlush() 
and CFS.FLCF#run() should just use 
SecondaryIndexManager#getIndexesNotBackedByCfs() and get their allocators 
directly instead of using SIM#getIndexes() and checking for null.

This was a conscious decision to permit custom 2i to use our allocators and 
count towards bookkeeping for memory utilisation.

bq. Composite/CellName/CellNameType/etc#copy() all now have an extra CFMetaData 
argument, while only NativeCell really uses it. Can we isolate its usage to 
NativeCell-specific methods and leave the rest alone?

Not sure how we do that when either can be present when you want to perform 
these calls. Possible I'm missing something obvious though, so please do let me 
know :)

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.
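
As a rough illustration of the arithmetic above, the sketch below assumes 8-byte-aligned allocations, so that a 32-bit value can address 32 GiB of native memory. It also assumes JVM objects are padded to 8-byte boundaries, which would explain how a full 8-byte address pushes the 20 raw bytes up to 24. `CompressedAddress` and its methods are hypothetical names for illustration, not part of the patch.

```java
final class CompressedAddress {
    // Assumption: every allocation is 8-byte aligned, so the low 3 bits are always zero.
    static final int ALIGNMENT_SHIFT = 3;

    // Pack a 64-bit offset (relative to some base address) into 32 bits.
    static int encode(long offset) {
        if ((offset & ((1L << ALIGNMENT_SHIFT) - 1)) != 0)
            throw new IllegalArgumentException("offset is not 8-byte aligned: " + offset);
        long shifted = offset >>> ALIGNMENT_SHIFT;
        if (shifted > 0xFFFFFFFFL)
            throw new IllegalArgumentException("offset out of addressable range: " + offset);
        return (int) shifted;
    }

    // Recover the 64-bit offset, treating the stored int as unsigned.
    static long decode(int compressed) {
        return (compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }

    // 2^32 slots of 8 bytes each.
    static long addressableBytes() {
        return (1L << 32) << ALIGNMENT_SHIFT;
    }

    // Per-cell on-heap overhead: object header + address + allocation-list reference,
    // rounded up to the (assumed) 8-byte object alignment of the JVM.
    static int perCellOverhead(int addressBytes) {
        int raw = 8 + addressBytes + 4;
        return (raw + 7) & ~7;
    }
}
```

With a 4-byte address `perCellOverhead` gives 16 bytes, and with an 8-byte address it gives 24, matching the figures in the description.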



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983425#comment-13983425
 ] 

Benedict commented on CASSANDRA-6694:
-

Oh, also, [~iamaleksey]: your assertion about super columns and sparse 
composites appears to be broken by the CliTest somehow. I haven't investigated, 
but this is why I introduced that branch. I've stripped out the condition and 
now always take that branch if the lookup in cfMetaData fails, which also deals 
with names being dropped when we aren't expecting it. The code no longer 
assumes that case can occur, but still copes with it.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983666#comment-13983666
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

bq. This was a conscious decision to permit custom 2i to use our allocators and 
count towards bookkeeping for memory utilisation.

I feel like you are lacking context here, wrt what custom 2i actually are and 
what implementations we have. The ones that exist don't need it, so IMO this 
was the wrong decision, even if conscious. Have a look at DSE's Solr 
implementation if curious.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983670#comment-13983670
 ] 

Benedict commented on CASSANDRA-6694:
-

I don't mind dropping it, but it seems a harmless addition for users who 
implement their own buffered writes for secondary indexes, so that they can 
consider the amount of data they are using for their own state when deciding 
which CFs to flush. The fact that DSE doesn't do this doesn't mean it isn't 
useful.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983675#comment-13983675
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

Sure, but I prefer to add stuff when it's clear that there is someone who 
actually needs it, and not some hypothetical user that doesn't exist.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983708#comment-13983708
 ] 

Benedict commented on CASSANDRA-6694:
-

To make shipping easier, I've pushed a rebased and squashed branch 
[here|https://github.com/belliottsmith/cassandra/tree/6694-reorg2-rebase]


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-28 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983709#comment-13983709
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Here is a [new branch|https://github.com/xedin/cassandra/compare/6694-final] 
with all of the changes from 6694-reorg2 squashed, plus a couple of commits to 
clean up and remove the secondary index getAllocator, which is unnecessary 
right now. I was about to push a similar refactoring for memtable pools to my 
branch, which made review much faster :)

I'm +1 on the combination of the squashed changes and the cleanup in my 
branch. I'm still not sure about the CellName implementation in 
AbstractNativeCell though; that is not my realm, so it would be nice if Sylvain 
(or somebody as close to that code, if anybody) could take a look...


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-25 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980847#comment-13980847
 ] 

Benedict commented on CASSANDRA-6694:
-

On the whole it looks good, but I have the following comments/concerns:

# DecoratedKey still isn't implemented (should be a relatively minor addition)
# The performance regression for MessageDigest updating is still there
# AbstractCell.localCopy(..MemtableAllocator) needs to be overridden; as it is 
you'll always get a regular Cell back
# You're still using static method implementations, it looks like? Cell.diff 
and Cell.reconcile
# I'm not a fan of mixing the util.memory hierarchy with knowledge of the 
memtable hierarchy. If we plan on this, I'd much prefer to move the whole lot 
into e.g. db.memtable; this might make most sense anyway
# I'd like to move the Cell implementations out of db into something (e.g. 
.memtable) as it's very crowded in there, and they're a dozen or so related 
classes that are easily extracted
# Given how many different kinds of allocator we now have (including 
IAllocator), I'd really like to rename AbstractAllocator to something more 
descriptive like ByteBufferAllocator

I still need to verify all of the changes within the Cells, as comparison is 
currently tricky due to the different hierarchy confusing git, but on the whole 
this branch is good if we address these concerns.
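
For the MessageDigest regression (item 2), the remedy outlined in an earlier comment on this ticket is to batch bytes into a reusable per-thread array and hand the array to the digest, falling back to update(byte) only for very small values. A minimal sketch of that idea follows, using a ByteBuffer as a stand-in for the native memory region. `DigestSketch`, `SMALL_THRESHOLD` and `CHUNK_SIZE` are hypothetical names, and the 12-byte crossover is only the figure reported from one laptop, so it should be measured rather than trusted.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;

final class DigestSketch {
    // Assumed crossover point below which byte-at-a-time updates are cheaper.
    static final int SMALL_THRESHOLD = 12;
    // Assumed scratch size; any reasonably large chunk works.
    static final int CHUNK_SIZE = 4096;

    // One reusable scratch array per thread, so no allocation per call.
    private static final ThreadLocal<byte[]> SCRATCH =
            ThreadLocal.withInitial(() -> new byte[CHUNK_SIZE]);

    // Feeds the remaining bytes of the buffer into the digest
    // without moving the caller's buffer position.
    static void updateDigest(MessageDigest digest, ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        if (view.remaining() < SMALL_THRESHOLD) {
            while (view.hasRemaining())
                digest.update(view.get());
            return;
        }
        byte[] scratch = SCRATCH.get();
        while (view.hasRemaining()) {
            int n = Math.min(view.remaining(), scratch.length);
            view.get(scratch, 0, n);      // bulk copy into the scratch array
            digest.update(scratch, 0, n); // one digest call per chunk
        }
    }
}
```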


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-25 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981434#comment-13981434
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


1-2, 5-7 I will address once the main functionality is settled.

bq. AbstractCell.localCopy(..MemtableAllocator) needs to be overridden; as it 
is you'll always get a regular Cell back

Good catch, I forgot to change that before I pushed. Have amended it to the 
original allocator commit and force pushed to my branch, so it's available.

 


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-24 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980585#comment-13980585
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


I have pushed allocation pools and minor refactoring to [my 
branch|https://github.com/xedin/cassandra/compare/CASSANDRA-6694], and also 
addressed some of the problems from [~iamaleksey]'s comment, except the 
concerns about the CellName implementation for AbstractNativeCell, which are 
mutual.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-22 Thread Aleksey Yeschenko (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1394#comment-1394
]

Aleksey Yeschenko commented on CASSANDRA-6694:
--

h3. Benedict’s original branch

ABTC.ColumnUpdater#apply() calls update.reconcile(existing) and skips
localCopy() if reconciled == existing. This means that we should optimise all
reconcile() implementations to prioritise the argument cell in case of ties for
optimal savings (and ties happen often enough, from retries and whatnot +
potentially counter updates if we decide to do that thing when batch commit log
is enabled). Currently we do the opposite. Would be easiest to simply swap the
call to existing.reconcile(update).
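
To illustrate the tie-breaking point, here is a minimal sketch using a simplified, hypothetical `SimpleCell` type (not Cassandra's actual Cell API) whose reconcile() compares timestamps only and lets the argument win on a tie. The real reconcile() implementations consider more than the timestamp, so treat this purely as a picture of why the argument should win ties when the call is update.reconcile(existing).

```java
final class ReconcileSketch {
    // Hypothetical stand-in for a cell: just a timestamp and a value.
    static final class SimpleCell {
        final long timestamp;
        final String value;

        SimpleCell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }

        // Returns the winning cell; on a timestamp tie the ARGUMENT wins.
        SimpleCell reconcile(SimpleCell other) {
            return timestamp > other.timestamp ? this : other;
        }
    }

    // Counts how many copies into the memtable were made.
    static int copies = 0;

    // Stand-in for localCopy(): the expensive step we would like to skip.
    static SimpleCell localCopy(SimpleCell cell) {
        copies++;
        return new SimpleCell(cell.timestamp, cell.value);
    }

    // Same shape as the update path described above:
    // skip the copy whenever the existing cell is the winner.
    static SimpleCell apply(SimpleCell existing, SimpleCell update) {
        SimpleCell reconciled = update.reconcile(existing);
        return reconciled == existing ? existing : localCopy(reconciled);
    }
}
```

With argument-wins tie-breaking, a retried write carrying the same timestamp returns the existing cell and costs no copy; if the receiver won ties instead, every retry would be copied again.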

getAllocator() doesn’t belong to SecondaryIndex, API-wise. CFS#logFlush() and
CFS.FLCF#run() should just use SecondaryIndexManager#getIndexesNotBackedByCfs()
and get their allocators directly instead of using SIM#getIndexes() and
checking for null.

Composite/CellName/CellNameType/etc#copy() all now have an extra CFMetaData
argument, while only NativeCell really uses it. Can we isolate its usage to
NativeCell-specific methods and leave the rest alone?

At least NativeCell#cql3ColumnName() can throw NPE when calling
metadata.getColumnDefinition(buffer).name. Just because it’s SIMPLE_SPARSE
doesn’t mean all the column names are predefined - it’s legal to insert
non-predefined cells w/ default_validator validator via Thrift/CQL2.
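
A null-safe lookup avoids that NPE. The sketch below is only an illustration under simplifying assumptions: `ColumnLookupSketch` is a hypothetical stand-in for the metadata, column names are plain strings rather than buffers, and falling back to the raw name is one possible policy, not necessarily what the patch should do.

```java
import java.util.HashMap;
import java.util.Map;

final class ColumnLookupSketch {
    // Hypothetical stand-in for the table metadata: raw column name -> predefined identifier.
    private final Map<String, String> definitions = new HashMap<>();

    void define(String rawName, String identifier) {
        definitions.put(rawName, identifier);
    }

    // Models a column being removed at any time, e.g. via ALTER TABLE.
    void drop(String rawName) {
        definitions.remove(rawName);
    }

    // Returns the predefined identifier if there is one; otherwise falls back
    // to the raw name instead of dereferencing a missing definition.
    String columnName(String rawName) {
        String identifier = definitions.get(rawName);
        return identifier != null ? identifier : rawName;
    }
}
```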

NativeCell#copy(), COMPOUND_SPARSE branch - there is no way a compound sparse
comparator and cfType = Super can coexist. Supers are all compound dense.

Generally, NativeCell methods seem to assume a bit too much about the sizes and
about what can and what can’t be present/absent. You can even guarantee
presence of a ColumnIdentifier for COMPOUND_SPARSE, and yet NativeCell#copy()
would throw an AssertionError if that’s the case. And CFMetaData is mutable,
too, and it is possible to remove a column via ALTER TABLE at any time.

I’m not comfortable +1-ing it until Sylvain has a look at at least these bits
(just the NativeCell methods).

The allocator hierarchy is confusing (I won’t claim to have understood it
entirely), as are the names there. The ‘Data’ prefix in DataAllocator is
absolutely meaningless in the context. Maybe MemtableAllocator would be more
meaningful?
Don’t have suggestions for the rest of the names and for making that hierarchy
more straightforward, but I can live with it as it is.

I very much dislike the Impl thing though. This is an uncomfortable step back
in Cell* hierarchy readability. Basic things like using IDEA’s Find Usages on
Cell.Impl#localCopy() not showing Counter/Expiring/Deleted counterparts’ usage
are annoying. This is my largest, and, really the only fundamental issue with
the branch. Other than that, and too many assumptions in certain NativeCell
methods, I’m okay with the branch.

Overall it looks reasonable, and is actually less invasive than I was afraid it
would be.

Nits: AbstractMemory formatting is all messed up.

h3. Pavel’s refactoring branch

Doesn’t build (although trivial-ish to make it build) and is incomplete (as
expected), and that does complicate judging the ugliness of the result.

Same issues and potential issues in AbstractNativeCells methods as in
NativeCell methods in the other branch.

Can’t form an opinion on Pavel’s Allocator/Pool approach, because it’s not here
yet, and I’m not sure I got it right from just reading the comments.

This *Cell hierarchy, though, I feel a lot more comfortable with.

I feel strongly that we should borrow the Impl-less Cell hierarchy from this
branch, if nothing else (and there isn’t much else yet) - this is my biggest
issue with the original.

As for the rest of it - the time is running low, we have to ship 2.1
eventually. Any chance you could flesh it out in the next few days, maybe until
Monday, Pavel? If not, I’m not sure if we should block beta2 further :\


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975540#comment-13975540
 ] 

Benedict commented on CASSANDRA-6694:
-

[~iamaleksey] have you had a chance to take a look and form an opinion? I'm 
happy to proceed with either approach, but we want to get a move on with one or 
the other if we intend to include this in 2.1.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-21 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975677#comment-13975677
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

[~benedict] It's on the top of my TODO list - so I'm looking at your branch 
now. Haven't looked at Pavel's yet. Need a few days, unless something more 
urgent distracts me (which is very unlikely).

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.
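
The overhead budget and the "alignment tricks" mentioned above can be sketched roughly as follows. This is only an illustration: the class name, the method names and the choice of 8-byte alignment (the same idea HotSpot uses for compressed object pointers) are assumptions, not taken from the patch.

```java
// Hypothetical sketch: all names and the 8-byte alignment are assumptions, not from the patch.
final class CompressedAddress
{
    // With 8-byte alignment the low 3 bits of every offset are zero, so they need not be stored.
    static final int ALIGNMENT_SHIFT = 3;

    // Per-cell on-heap budget from the description: object header + address + allocation-list reference.
    static final int OBJECT_HEADER_BYTES = 8;
    static final int ADDRESS_BYTES = 4;
    static final int REFERENCE_BYTES = 4;

    static int perCellOverhead()
    {
        return OBJECT_HEADER_BYTES + ADDRESS_BYTES + REFERENCE_BYTES;
    }

    // Pack an aligned offset (relative to some base address) into 32 bits.
    static int compress(long offset)
    {
        if ((offset & ((1L << ALIGNMENT_SHIFT) - 1)) != 0)
            throw new IllegalArgumentException("offset is not 8-byte aligned: " + offset);
        if (offset < 0 || (offset >>> ALIGNMENT_SHIFT) > 0xFFFFFFFFL)
            throw new IllegalArgumentException("offset out of addressable range: " + offset);
        return (int) (offset >>> ALIGNMENT_SHIFT);
    }

    // Recover the full offset; the stored int is treated as unsigned.
    static long decompress(int compressed)
    {
        return (compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }

    // Region addressable through a 4-byte compressed address: 2^32 slots of 8 bytes each.
    static long addressableBytes()
    {
        return (1L << 32) << ALIGNMENT_SHIFT;
    }
}
```

Under these assumptions a 4-byte address covers 32 GiB of native memory; once that is not enough, the address has to grow to 8 bytes, which is where the 24-byte figure in the description comes from.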




[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972524#comment-13972524
]

Benedict commented on CASSANDRA-6694:
-

So, on the whole I really don't perceive this approach as better: there's a
great deal of code duplication now (set to get worse still when you finish the
refactor for DecoratedKey) between each of the correspondingly named cell
implementations. Personally I think the Impl approach is neater as a result of
avoiding that (this may be more pronounced if we decide to optimise equals() as
you suggested). That said, if this moves us forwards I can live with it, if you
can address point 1 below.

There are a few problems though:

# I am *very* opposed to a public setPeer() method. This is a deal breaker for
me - but it can be avoided with a bit more refactoring.
# Your optimised updateDigest function is actually much slower than the old
implementation for all but the smallest values: an optimised version needs to
batch the contents into an array (stored in a ThreadLocal) and call
updateDigest with the array, unless the total size is very small (there's a
crossover point on my laptop of about 12 bytes, under which it's faster to call
update(byte)).
# AbstractNativeCell.getBytes actually calls setBytes
# excessHeapSize... should be unsharedHeapSize...
# There should be no hashCode method in Buffer\*Cell - I removed these for a
reason. Because we can have a Cell that is a CellName, and vice-versa, using a
Cell as a key for a map is likely dangerous. Since we don't do it anywhere,
it's safe to simply remove the methods.
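
The batching described in point 2 could look roughly like the sketch below. It is not the code from either branch: a ByteBuffer stands in for native memory, and the class name, the buffer size and the threshold constant are assumptions (the 12-byte figure is only the crossover reported above for one laptop).

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;

// Hypothetical sketch: names, buffer size and threshold are assumptions, not from the patch.
final class DigestBatching
{
    // Below this size, per-byte updates were reported as faster; treat it as a tunable.
    static final int SMALL_VALUE_THRESHOLD = 12;
    static final int BUFFER_SIZE = 4096;

    // One scratch array per thread, so batching does not allocate on every call.
    private static final ThreadLocal<byte[]> SCRATCH = new ThreadLocal<byte[]>()
    {
        @Override
        protected byte[] initialValue()
        {
            return new byte[BUFFER_SIZE];
        }
    };

    static void updateDigest(MessageDigest digest, ByteBuffer source)
    {
        ByteBuffer view = source.duplicate(); // leave the caller's position untouched
        if (view.remaining() < SMALL_VALUE_THRESHOLD)
        {
            // Tiny value: feed the digest one byte at a time.
            while (view.hasRemaining())
                digest.update(view.get());
            return;
        }
        // Larger value: copy into the scratch array in chunks and update in bulk.
        byte[] scratch = SCRATCH.get();
        while (view.hasRemaining())
        {
            int chunk = Math.min(view.remaining(), scratch.length);
            view.get(scratch, 0, chunk);
            digest.update(scratch, 0, chunk);
        }
    }
}
```

Both paths must produce the same digest; only the cost differs, so the threshold is worth measuring on the target hardware rather than taking from here.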

There may be other minor issues, I'll hold off giving it a formal review until
we decide the direction we're going. To respond to a few of your comments:

bq. CounterUpdateCell interface is missing as well as NativeCounterUpdateCell
implementation to match it.

There shouldn't be one for the time being - we can never construct one.

bq. CounterUpdateCell should be BufferCounterUpdateCell as it extends BufferCell

Same reason - it doesn't exist as either or, so I made a conscious decision to
leave it as a CounterUpdateCell: the fact that it extends BufferCell is kind of
unimportant. Its purpose is somewhat different, and I think it is better left
named CounterUpdateCell, as that is its purpose (to carry a counter update as
far as the memtable, and no further).

bq. Impl classes extends another Impl classes which doesn't make much sense as
all of the methods are static.

This brings in the namespace of the extended class' static methods, which is
useful.
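
A small sketch of what "bringing in the namespace" means in Java, and of why static methods still do not override each other. The class and method names below are made up for illustration and do not mirror the real Impl classes.

```java
// Hypothetical sketch: these classes are illustrative stand-ins, not the real Impl classes.
final class StaticNamespaceDemo
{
    static class CellImpl
    {
        static int serializationFlags()
        {
            return 0;
        }

        static String kind()
        {
            return "cell";
        }
    }

    static class DeletedCellImpl extends CellImpl
    {
        static final int DELETION_MASK = 0x01; // assumed value, for illustration only

        // Hides (does not override) CellImpl.serializationFlags(); resolution is at compile time.
        static int serializationFlags()
        {
            return DELETION_MASK;
        }

        static String describe()
        {
            // kind() resolves to CellImpl.kind() without qualification, because
            // static members of the superclass are in scope here.
            return kind() + ":" + serializationFlags();
        }
    }
}
```

So extending another Impl class saves qualifying calls to its static helpers, while a call made through the parent class name still reaches the parent's version.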

bq. When taken out of context like that it doesn't really make sense but what I
meant, there are situation where we don't really need to get BB from the
CellName but can transfer bytes directly (especially for the native cell
implementations).

Sure, but again: scope of ticket, and care needs to be taken when doing this
(e.g. your updateDigest modifications)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972776#comment-13972776
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


To address all of your comments: this is not intended for any kind of review 
yet. It is just a demonstration of the idea, which is why I basically carried 
over all of the methods from the original implementations and didn't rename or 
move anything. Also, I'm fine with methods in both implementations returning 
constant values, like serializationFlags or isMarkedForDeleted. Apart from that 
there is not much code duplication, and it will shrink further when hashCode 
and other methods go away. That would probably leave us with only dataSize and 
serializedSize duplicated, and I guess we can come up with something clever for 
native cells there too. Regarding the point about updateDigest: it's meant as 
an illustration of the kind of thing we can do if we have two different 
implementations of it; it is not optimized for performance yet.

bq. There shouldn't be one for the time being - we can never construct one.

and 

bq. Same reason - it doesn't exist as either or, so I made a conscious decision 
to leave it as a CounterUpdateCell: the fact that it extends BufferCell is kind 
of unimportant. Its purpose is somewhat different, and I think it is better 
left named CounterUpdateCell, as that is its purpose (to carry a counter update 
as far as the memtable, and no further).

It is constructed in ColumnFamily and ColumnSerializer. If there is supposed to 
be only one implementation for now, let's name it appropriately and use it like 
all the other buffered cells.

bq. This brings in the namespace of the extended class' static methods, which 
is useful.

But why do we care, and what does it give us, given that those interfaces are 
called directly and static methods don't override each other?

bq. Sure, but again: scope of ticket, and care needs to be taken when doing 
this (e.g. your updateDigest modifications)

I don't really follow what you are implying with that. The scope is to 
introduce native implementations as optimized as possible, so why would we miss 
out on such low-hanging fruit?




[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972796#comment-13972796
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. the scope is introduce native implementations -as optimized as possible-

Otherwise we need to do a lot more than the changes you are suggesting :)

bq.  Also I'm fine if methods in both implementations are going to return 
constant values like serializationFlags or isMarkedForDeleted

Well, these are still duplication - it is not clear as a result where the 
definition of these behaviours lives. If the semantics change in future, it may 
introduce errors unnecessarily. Either way equals(), reconcile() and 
validateFields() will still be issues. You don't seem to have implemented most 
of these methods yet (looks like your code doesn't actually compile). These 
methods are each non-trivial amounts of code duplication, equals() especially 
so if we optimise it as you want to. CounterCell.diff() will also need to be 
duplicated.

But, like I said, I can probably live with all of this if we address the 
setPeer() issue. equals() should probably still end up in a shared static 
method, at the very least, though.



[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972832#comment-13972832
 ] 

Marcus Eriksson commented on CASSANDRA-6694:


I'm +1 on [~benedict]'s branch (have not looked at the one by [~xedin] yet)

nits:
* A few methods in Cell.Impl look redundant, isMarkedForDelete/isLive for 
example, kept around for symmetry?
* License header in DeletedCell and ExpiringCell
* Javadoc comment in NativeAllocator looks wrong


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973220#comment-13973220
]

Pavel Yaskevich commented on CASSANDRA-6694:

bq. Well, these are still duplication - it is not clear as a result where the
definition of these behaviours lives. If the semantics change in future, it may
introduce errors unnecessarily. Either way equals(), reconcile() and
validateFields() will still be issues. You don't seem to have implemented most
of these methods yet (looks like your code doesn't actually compile). These
methods are each non-trivial amounts of code duplication, equals() especially
so if we optimise it as you want to. CounterCell.diff() will also need to be
duplicated.

Most of the duplicated methods have static behavior which is not going to
change, e.g. isMarkedForDelete, getMarkedForDeleteAt or serializationFlags.
CounterCell.diff and reconcile are living in the interface for now. I will
address the setPeer(long) problem and hashCode.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973226#comment-13973226
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. CounterCell.diff and reconcile are living in the interface for now

Ah. This is a Java 8 only feature, which is why I missed it. Not really 
feasible.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973297#comment-13973297
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


I'm not talking about default methods in interfaces, I'm just saying that I 
added static diff/reconcile to CounterCell for now :)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-17 Thread Aleksey Yeschenko (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973395#comment-13973395
]

Aleksey Yeschenko commented on CASSANDRA-6694:
--

bq. Its purpose is somewhat different, and I think it is better left named
CounterUpdateCell, as that is its purpose (to carry a counter update as far as
the memtable, and no further).

FWIW it doesn't even make it to a memtable in 2.1, ever. That said, not calling
it BufferCounterUpdateCell would bother my consistency OCD a lot, and I'm not
done with counters until 3.0. Can you do my OCD a tiny favor and call it
consistently with the other implementations? (: Thanks.

bq. There should be no hashCode method in Buffer*Cell - I removed these for a
reason. Because we can have a Cell that is a CellName, and vice-versa, using a
Cell as a key for a map is likely dangerous. Since we don't do it anywhere,
it's safe to simply remove the methods.

Maybe we should just throw UnsupportedOperationException then, but leave the
methods? I agree that using Cell-s as keys is very unlikely, but stuff like
this has bitten us before.
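
A minimal sketch of the "leave the method but throw" option, so that any accidental use of a cell as a hash key fails immediately instead of misbehaving quietly. The class names are placeholders and the message text is made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: placeholder classes, not the real Cell hierarchy.
final class UnhashableCellDemo
{
    static abstract class AbstractCell
    {
        @Override
        public final int hashCode()
        {
            // Fail fast rather than hand out a hash that may be inconsistent with equals().
            throw new UnsupportedOperationException("Cells must not be used as hash keys");
        }
    }

    static final class BufferCell extends AbstractCell
    {
    }

    // Returns true if putting a cell into a HashMap is rejected.
    static boolean failsFastAsMapKey()
    {
        Map<Object, Object> map = new HashMap<Object, Object>();
        try
        {
            map.put(new BufferCell(), "value"); // HashMap.put calls hashCode() on the key
            return false;
        }
        catch (UnsupportedOperationException e)
        {
            return true;
        }
    }
}
```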

Haven't read either branch yet, but planning to soon, just wanted to jump at
the opportunity to bikeshed a bit.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973405#comment-13973405
]

Benedict commented on CASSANDRA-6694:
-

bq. Can you do my OCD a tiny favor and call it consistently with the other
implementations? (: Thanks.

Sure. I have a preference to keep it that way, but not a strong one.

bq. Maybe we should just throw UnsupportedOperationException then, but leave
the methods? I agree that using Cell-s as keys is very unlikely, but stuff like
this has bitten us before.

Also sure.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973435#comment-13973435
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Regarding hashCode, that's what we do: I do it in AbstractCell now, and 
Benedict does it in both BufferCell and NativeCell.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973534#comment-13973534
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Ok, hashCode and setPeer changes are now pushed to the same branch. 
AbstractNativeCell is now independent of NativeAllocation because 
NativeAllocator returns the aligned peer directly, which allows the peer field 
to be made final in AbstractNativeCell. I have also pushed the set/get logic 
for the data size associated with the pointer into the NativeAllocator, as it's 
basically its metadata. IMO that's a bit cleaner than how it is done in 
Benedict's branch, where NativeAllocation adjusts the pointer past the size 
(internalPeer() { return peer + 4; }) but NativeAllocator takes care of 
allocating the 4 additional bytes on top of the requested size.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973538#comment-13973538
 ] 

Benedict commented on CASSANDRA-6694:
-

I don't think this is the right approach: with the changes we are making, we 
are pretty much precluding doing anything fancy with GC (we'll have to rely on 
malloc for now). As such the size is no longer providing any useful 
bookkeeping information to the NativeAllocator. It should be dealt with 
entirely in the AbstractNativeCell - its concept of size is entirely unique to 
it for now. This also, separately, makes packing structs of NativeCell a lot 
more straightforward.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973571#comment-13973571
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


I just don't like that in NativeAllocation we assume that NativeAllocator has 
reserved 4 bytes for us. So I decided to put everything into NativeAllocator 
and only return the useful space, so we don't have to add 4 every time we need 
a peer. It could be done in AbstractNativeCell, which would allocate size + 4, 
or it could be done in NativeAllocator, which would report on demand how big 
the allocation was based on the area pointer it returned (which is what 
NativeAllocator.getDataSize(areaPointer) does). Either of those places 
(AbstractNativeCell or NativeAllocator) works for me.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973582#comment-13973582
 ] 

Benedict commented on CASSANDRA-6694:
-

The only reason it was happening in NativeAllocator was to support moving the 
peer around (so you need to know how much memory you're copying). 

NativeAllocation assuming it has (i.e. _being defined as having_) a size prefix 
is fine when it is tightly coupled with NativeAllocator (like it is in my 
branch) - but once you have it as a final field in another object, 
NativeAllocator should simply have no say in the matter. It never needs to know 
the size of the allocation, so we should just redefine what our 
AbstractNativeCell considers to be its size in its sizeOf() calculation.
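
The split being argued for here can be sketched as follows: the allocator hands out raw memory and knows nothing about its contents, while the cell owns its own size prefix and sizeOf() calculation. This is a simplified illustration, not the code from either branch: a direct ByteBuffer stands in for a raw peer address, and every name except the 4-byte prefix idea is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: a ByteBuffer stands in for a raw native address; names are illustrative.
final class SizePrefixDemo
{
    // The allocator only hands out regions of the requested size; it never inspects them.
    static final class NativeAllocator
    {
        ByteBuffer allocate(int bytes)
        {
            return ByteBuffer.allocateDirect(bytes);
        }
    }

    static final class NativeCell
    {
        static final int SIZE_PREFIX_BYTES = 4;

        private final ByteBuffer peer; // final: the cell never moves once allocated

        NativeCell(NativeAllocator allocator, byte[] value)
        {
            // The cell, not the allocator, decides to reserve room for its size prefix.
            peer = allocator.allocate(SIZE_PREFIX_BYTES + value.length);
            peer.putInt(0, value.length);
            for (int i = 0; i < value.length; i++)
                peer.put(SIZE_PREFIX_BYTES + i, value[i]);
        }

        int dataSize()
        {
            return peer.getInt(0);
        }

        // The cell's own notion of its size: prefix plus payload.
        int sizeOf()
        {
            return SIZE_PREFIX_BYTES + dataSize();
        }

        byte byteAt(int index)
        {
            if (index < 0 || index >= dataSize())
                throw new IndexOutOfBoundsException("index " + index + ", size " + dataSize());
            return peer.get(SIZE_PREFIX_BYTES + index);
        }
    }
}
```

With this shape the "+ 4" lives in exactly one class, and the allocator could later change how it obtains memory without the cell layout being affected.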

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.
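
The overhead arithmetic in the description can be sketched as follows. This is an illustration under stated assumptions, not code from the patch: it assumes 8-byte-aligned allocations (so a 4-byte value can address 32 GiB relative to a base), and it assumes the 24-byte figure comes from 8 + 8 + 4 = 20 bytes being rounded up to 8-byte object alignment. CompressedAddress and its methods are hypothetical names.

```java
// Sketch of the "alignment trick": if every allocation is 8-byte aligned,
// the low 3 bits of an offset are always zero and need not be stored.
final class CompressedAddress {
    static final int ALIGNMENT_SHIFT = 3; // assumes 8-byte alignment
    static final long MAX_OFFSET = (1L << (32 + ALIGNMENT_SHIFT)) - (1L << ALIGNMENT_SHIFT);

    // Packs an aligned offset (relative to some base) into 4 bytes.
    static int compress(long offset) {
        if ((offset & ((1L << ALIGNMENT_SHIFT) - 1)) != 0)
            throw new IllegalArgumentException("offset is not 8-byte aligned: " + offset);
        if (offset < 0 || offset > MAX_OFFSET)
            throw new IllegalArgumentException("offset out of range: " + offset);
        return (int) (offset >>> ALIGNMENT_SHIFT);
    }

    // Recovers the full offset, treating the stored int as unsigned.
    static long decompress(int compressed) {
        return (compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }

    // Rounds a raw object size up to the next multiple of 8 bytes.
    static int alignUp(int bytes) {
        return (bytes + 7) & ~7;
    }

    // Per-cell on-heap footprint: object header + address + list reference, aligned.
    static int onHeapOverhead(int objectHeader, int address, int listReference) {
        return alignUp(objectHeader + address + listReference);
    }
}
```

With these assumptions, onHeapOverhead(8, 4, 4) gives the 16-byte target and onHeapOverhead(8, 8, 4) gives the 24-byte fallback mentioned above.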



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973595#comment-13973595
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Sure, if you like that better I will change it right away. If we need it in 
the allocator for some reason, we can always change it.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973618#comment-13973618
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Done, I have force-pushed to my branch. AbstractNativeCell now handles size; 
NativeAllocator has nothing to do with it.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973633#comment-13973633
 ] 

Benedict commented on CASSANDRA-6694:
-

Thanks. Although it looks like you haven't updated any of the offsets to work 
with the new layout?

As to the other changes you've made: I do not like the pollution of 
PoolAllocator with supportsNative(). Since this branch is supposed to be 
pushing idiomatic Java usage, let's stick to using interfaces for 
specialisation since we can.
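
One way to read "using interfaces for specialisation" is sketched below: native allocation is expressed as a sub-interface, so the base allocator type carries no supportsNative() flag. PoolAllocator is a name from this thread, but every member shown here, along with NativeDataAllocator, HeapOnlyAllocator, OffHeapAllocator and NativeCellWriter, is hypothetical and not taken from either branch.

```java
// Operations common to every allocator; no native-specific flag here.
interface PoolAllocator {
    long allocatedBytes();
}

// The native capability is a type, not a boolean on the base interface.
interface NativeDataAllocator extends PoolAllocator {
    long allocateNative(int size); // returns a toy offset, not a real address
}

final class HeapOnlyAllocator implements PoolAllocator {
    private long allocated;

    byte[] allocate(int size) {
        allocated += size;
        return new byte[size];
    }

    public long allocatedBytes() {
        return allocated;
    }
}

final class OffHeapAllocator implements NativeDataAllocator {
    private long next;

    public long allocateNative(int size) {
        long offset = next;
        next += size;
        return offset;
    }

    public long allocatedBytes() {
        return next;
    }
}

final class NativeCellWriter {
    // Requires the specialised type, so a heap-only allocator cannot be
    // passed here by mistake and no runtime capability check is needed.
    static long reserve(NativeDataAllocator allocator, int size) {
        return allocator.allocateNative(size);
    }
}
```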


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973645#comment-13973645
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Why, it does: internalPeer() adds 4 and internalSize() subtracts 4, and all 
get/set methods use internalPeer() + offset. Regarding supportsNative() and 
allocateNative (and I was waiting for that): I did that because I don't want 
to put time into adding the DataAllocator and DataPool interfaces that your 
code has just yet. Once it's decided which way we want to go, I will remove 
allocateNative and do proper work there. This is still intended as just an 
idea presentation for how to handle Cell without Impl classes.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973647#comment-13973647
]

Benedict commented on CASSANDRA-6694:
-

bq. This is still intended as just an idea presentation for how to handle Cell
without Impl classes.

OK, cool. Glad we're staying on topic :)

bq. Why it does - internalPeer does + 4 and internalSize does - 4

My mistake. I was expecting to see the static OFFSET fields updated - we should
probably optimise that before we finish up (now that we can), but obviously
fine for now.
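
The two approaches being compared can be sketched like this, assuming a 4-byte size prefix and a hypothetical timestamp field placed directly after it. A heap ByteBuffer stands in for native memory, and all names other than internalPeer are invented for the example.

```java
import java.nio.ByteBuffer;

final class OffsetLayouts {
    static final int SIZE_PREFIX = 4;   // assumed width of the size prefix
    static final int TIMESTAMP_REL = 0; // hypothetical field offset after the prefix
    // The same field with the prefix folded into the static constant.
    static final int TIMESTAMP_ABS = SIZE_PREFIX + TIMESTAMP_REL;

    // Variant discussed in the thread: every access goes through an adjusted peer.
    static int internalPeer(int peer) {
        return peer + SIZE_PREFIX;
    }

    static long timestampViaAdjustedPeer(ByteBuffer memory, int peer) {
        return memory.getLong(internalPeer(peer) + TIMESTAMP_REL);
    }

    // Possible later optimisation: the constant already accounts for the
    // prefix, so no per-access adjustment of the peer is needed.
    static long timestampViaStaticOffset(ByteBuffer memory, int peer) {
        return memory.getLong(peer + TIMESTAMP_ABS);
    }
}
```

Both methods read the same bytes; the second simply moves the addition of the prefix width from run time into the constant.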


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-16 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970487#comment-13970487
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


Also, it seems that for some of the methods (e.g. updateDigest, delta, 
dataSize, diff, reconcile, hashCode) it would be much better to have native 
implementations which work with underlying bytes directly from day one. Some of 
them, for example, use value().remaining(), value().compareTo(), 
value().duplicate(), or name.toByteBuffer(), which convert data from one 
representation to another for no real reason, so we can actually end up 
generating a lot more temporary objects than we anticipate. There is another 
concern related to the value() method, which converts a pointer to a 
DirectBuffer: the problem is that (at least in OpenJDK, and I think Oracle does 
the same) initialization of that class is synchronized and creates a 
PhantomReference, which with most collectors will only be purged by a Full GC.
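
To illustrate the kind of temporary-object avoidance meant here, the sketch below compares two values using absolute reads only, so no duplicate(), slice() or array copy is created. It is an assumption-laden stand-in: ByteBuffer plays the role of the underlying bytes, and DirectCompare is a hypothetical helper, not a method from the patch.

```java
import java.nio.ByteBuffer;

final class DirectCompare {
    // Lexicographic unsigned comparison over the remaining bytes of each
    // buffer. Only absolute get(int) calls are used, so neither buffer's
    // position changes and no intermediate buffer objects are allocated.
    static int compareUnsigned(ByteBuffer a, ByteBuffer b) {
        int aPos = a.position(), bPos = b.position();
        int aLen = a.remaining(), bLen = b.remaining();
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a.get(aPos + i) & 0xFF;
            int y = b.get(bPos + i) & 0xFF;
            if (x != y)
                return x < y ? -1 : 1;
        }
        // Equal prefix: the shorter value sorts first.
        return Integer.compare(aLen, bLen);
    }
}
```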


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-16 Thread Benedict (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970603#comment-13970603
]

Benedict commented on CASSANDRA-6694:
-

bq. for now we are allocating BufferCounterCell which allows us to use
CounterCell.Impl.reconcile for both implementations

We only allocate a new object in the case that the reconcile result isn't one
or the other of the original inputs. This object is only incredibly short
lived, and we decided it was easier than passing through the allocator for
reconcile. This may be slightly worse than we'd like as a result of the
different cellname layouts, but that can be smoothed over with time. It's
cleaner in the Memtable +ABC code to keep the pool allocation separate from
this reconcile, and it's small fry compared to the other stuff we're doing on
write of counters.
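
The allocation behaviour described above (return one of the inputs where possible, allocate a short-lived object only when the result is neither) can be sketched with a toy counter. The merge rule below is invented for illustration and is not Cassandra's counter-context logic; ToyCounter is a hypothetical class.

```java
final class ToyCounter {
    final long timestamp;
    final long total;

    ToyCounter(long timestamp, long total) {
        this.timestamp = timestamp;
        this.total = total;
    }

    // Toy rule: totals add and the newest timestamp wins. When one side
    // contributes nothing new, the other input is returned as-is, so no
    // object is allocated on that path.
    static ToyCounter reconcile(ToyCounter left, ToyCounter right) {
        if (right.total == 0 && left.timestamp >= right.timestamp)
            return left;
        if (left.total == 0 && right.timestamp >= left.timestamp)
            return right;
        // Only here is a new, short-lived on-heap object created.
        return new ToyCounter(Math.max(left.timestamp, right.timestamp),
                              left.total + right.total);
    }
}
```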

bq. it would be much better to have native implementations which work with
underlying bytes directly from day one

Agreed that some of these would be nice, however we rarely (if ever) call
these, and as per your comments wrt zero-copy (CASSANDRA-6842), if we aren't
worried about copying the contents, we shouldn't be worried about allocating
temporary objects. Now, there are some methods I would say would be nice to
have native implementations of sooner than later (e.g. updateDigest), but I
don't think they're by any means _essential_. What is going to be *far* more
impactful is CASSANDRA-6755, as this has a reasonably large negative impact on
name lookups (and to a lesser degree slicing) from a memtable record.

That said, some of them would be quite easy to implement. So I'm not totally
opposed to delivering them from day 1, I just wanted this patch set to be
clearly readable and well contained. It's pretty big as it is. I think it would
be nice to put any of these optimisations in a second ticket.

If you're suggesting we drop the Impl hierarchy entirely from this patchset and
just duplicate the methods and optimise, I can maybe get behind that. However
optimising reconcile() and equals() gets ugly quickly if you want to be able to
deal with either side of the equation being one or the other (often we'll
reconcile different kinds, but not always). So we still need a shared
implementation, but one that is capable of detecting the kind of Cell on each
side and selecting the correct version of the method. That leaves very few
methods we'd be optimising in Native*Cell, so most of the code would be
duplicated unnecessarily if we took that route. But I can live with that if the
reviewers can all live with the increased size of the patchset.
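
A rough sketch of such a shared implementation, under the assumption that only the native/native pairing gets a specialised path and every other pairing falls back to a generic one. SketchCell, HeapBackedCell, NativeBackedCell and CellComparisons are hypothetical names, and a direct ByteBuffer stands in for native memory.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

interface SketchCell {
    byte[] valueCopy(); // generic path: materialises the value on heap
}

final class HeapBackedCell implements SketchCell {
    private final byte[] value;
    HeapBackedCell(byte[] value) { this.value = value.clone(); }
    public byte[] valueCopy() { return value.clone(); }
}

final class NativeBackedCell implements SketchCell {
    private final ByteBuffer memory;
    NativeBackedCell(byte[] value) {
        memory = ByteBuffer.allocateDirect(value.length);
        for (int i = 0; i < value.length; i++)
            memory.put(i, value[i]);
    }
    int length() { return memory.capacity(); }
    byte byteAt(int i) { return memory.get(i); }
    public byte[] valueCopy() {
        byte[] out = new byte[length()];
        for (int i = 0; i < out.length; i++)
            out[i] = byteAt(i);
        return out;
    }
}

final class CellComparisons {
    enum Path { NATIVE_NATIVE, MIXED, BUFFER_BUFFER }

    // Detects the kind of cell on each side.
    static Path select(SketchCell a, SketchCell b) {
        boolean an = a instanceof NativeBackedCell, bn = b instanceof NativeBackedCell;
        if (an && bn) return Path.NATIVE_NATIVE;
        return (an || bn) ? Path.MIXED : Path.BUFFER_BUFFER;
    }

    static boolean valueEquals(SketchCell a, SketchCell b) {
        switch (select(a, b)) {
            case NATIVE_NATIVE:
                return nativeEquals((NativeBackedCell) a, (NativeBackedCell) b);
            default:
                return Arrays.equals(a.valueCopy(), b.valueCopy());
        }
    }

    // Specialised path: compares bytes in place, without copying to heap.
    private static boolean nativeEquals(NativeBackedCell a, NativeBackedCell b) {
        if (a.length() != b.length()) return false;
        for (int i = 0; i < a.length(); i++)
            if (a.byteAt(i) != b.byteAt(i)) return false;
        return true;
    }
}
```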

bq. name.toByteBuffer() convert data from one representation to another for no
real reason

Why do you say no real reason? This is the serialization format, so we have
to convert to it. That's the definition of what toByteBuffer() should return.
We only call it when writing to disk or to the network, and it is no different
from the original implementation in that regard. That's not to say with time we
cannot change this, but there's not much we can do yet.

bq. There is another concern related to ... DirectBuffer ... initialization of
that class is synchronized and creates PhantomReference

I construct it using unsafe, which skips all constructors. So there is no
synchronization or PhantomReference creation.
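
The general mechanism referred to here, creating an instance without running any constructor, can be sketched with sun.misc.Unsafe.allocateInstance. This only demonstrates the constructor-skipping behaviour on a made-up class (Tracked); it does not reproduce how the branch builds its buffers, and it assumes a JVM where sun.misc.Unsafe is still reachable through reflection.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class ConstructorSkipping {
    // Made-up class whose constructor has an observable side effect.
    static final class Tracked {
        static int constructorCalls = 0;
        long address;

        Tracked() {
            constructorCalls++;
        }
    }

    // Obtains the Unsafe singleton via its private static field.
    static Unsafe unsafe() throws ReflectiveOperationException {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        return (Unsafe) f.get(null);
    }

    // Allocates a Tracked instance without invoking its constructor, so
    // constructorCalls is untouched and fields keep their default values.
    static Tracked allocateWithoutConstructor() {
        try {
            return (Tracked) unsafe().allocateInstance(Tracked.class);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not available", e);
        }
    }
}
```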

[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-16 Thread Pavel Yaskevich (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972205#comment-13972205
]

Pavel Yaskevich commented on CASSANDRA-6694:

So here is the
[branch|https://github.com/xedin/cassandra/compare/CASSANDRA-6694] which
implements my idea of how to get rid of the Impl classes for Cell (plus an
optimized updateDigest for both Cell implementations and a couple of other
things). I left DecoratedKey alone for now. The work is not fully complete yet,
but only a couple of nits are missing: I need to change a couple of places to
use CFMetaData and clone native cells, so I decided not to do it if we are not
going to go with that code.

Regarding [~benedict]'s reorg branch, I found a couple of problems:

# internalGetLong(long, long) is actually meant to be internalSetLong(long,
long) in AbstractMemory;
# CounterUpdateCell should be BufferCounterUpdateCell as it extends BufferCell
# CounterUpdateCell interface is missing as well as NativeCounterUpdateCell
implementation to match it.

bq. Why do you say no real reason? This is the serialization format, so we
have to convert to it. That's the definition of what toByteBuffer() should
return. We only call it when writing to disk or to the network, and is no
different from the original implementation in that regard. That's not to say
with time we cannot change this, but there's not much we can do yet.

When taken out of context like that it doesn't really make sense, but what I
meant is that there are situations where we don't really need to get a BB from
the CellName but can transfer bytes directly (especially for the native cell
implementations).

bq. I construct it using unsafe, which skips all constructors. So there is no
synchronization or PhantomReference creation.

Right, we should be good there, my bad.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-15 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970462#comment-13970462
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


[~benedict] While working on trying to avoid usage of Impl classes and looking 
closer at the code, I have a question which, knowing that the future is going 
to be totally off-heap, makes sense to ask now. The current Native*Cell classes 
re-use Impl code from static implementations of interfaces, but some of the 
methods (e.g. reconcile for Counter(Update)Cell) in certain conditions need to 
generate a new object (for now we are allocating BufferCounterCell which allows 
us to use CounterCell.Impl.reconcile for both implementations). Do you have an 
action plan regarding required changes in that regard for the next step in this 
series, when we are not going to copy things back to heap?


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965066#comment-13965066
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


[~jbellis] I will leave this alone if you and others are fine with maintaining 
the code as it is in the patch set. The discussion I'm trying to have, and I 
presume others are interested too, is centered around the question of whether 
there is a better (cleaner, if you will) way to organize Cell to avoid 
unnecessary field allocation, as well as keeping us from introducing static 
Impl classes with only static methods inside that extend each other. I still 
don't understand why we would extend one class that has only static methods 
from another with the same method layout (e.g. DeletedCell.Impl extends 
Cell.Impl), which results in a bigger constant pool per class and has byte code 
implications that I have previously described. From my point of view, it looks 
like we are basically trying to re-build inside of Cassandra what the JVM 
already provides as a platform.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965465#comment-13965465
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. why we can't have a simple implementation of the cell which has one buffer 
+ metadata about component sizes (which could also be encoded) instead of 
having buffer per component in the name (if composite) + buffer for value + 
long timestamp

I think this is the key question so I want to back out of the Impl rabbit hole 
for a minute to address that.  This would absolutely simplify things a great 
deal in terms of the Allocator design.  The problem is that it has a much 
bigger impact on the rest of the code, and the consensus from the last ticket 
was, We want to have off-heap as an option, but we want the default to stay 
on-heap and change as little as possible.  So, I agree that what you are 
saying is cleaner but I think we should push it out to 3.0 given the 
constraints for 2.1.
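
For reference, the "one buffer + metadata about component sizes" idea quoted above could look roughly like the sketch below. The layout (8-byte timestamp, 2-byte component count, length-prefixed name components, then a length-prefixed value) is invented for this example, as is the FlatCell name; signed short lengths limit components to 32767 bytes, which is acceptable for a sketch only.

```java
import java.nio.ByteBuffer;

final class FlatCell {
    // Hypothetical layout:
    // long timestamp | short count | { short len, bytes }* | int valueLen | value
    static ByteBuffer encode(long timestamp, byte[][] nameComponents, byte[] value) {
        int size = 8 + 2 + 4 + value.length;
        for (byte[] c : nameComponents)
            size += 2 + c.length;
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putLong(timestamp);
        out.putShort((short) nameComponents.length);
        for (byte[] c : nameComponents) {
            out.putShort((short) c.length);
            out.put(c);
        }
        out.putInt(value.length);
        out.put(value);
        out.flip();
        return out;
    }

    static long timestamp(ByteBuffer cell) {
        return cell.getLong(0);
    }

    static byte[] nameComponent(ByteBuffer cell, int index) {
        int count = cell.getShort(8);
        if (index < 0 || index >= count)
            throw new IndexOutOfBoundsException("component " + index);
        int pos = 10;
        for (int i = 0; i < index; i++)
            pos += 2 + cell.getShort(pos); // skip earlier components
        return copy(cell, pos + 2, cell.getShort(pos));
    }

    static byte[] value(ByteBuffer cell) {
        int count = cell.getShort(8);
        int pos = 10;
        for (int i = 0; i < count; i++)
            pos += 2 + cell.getShort(pos); // skip all name components
        return copy(cell, pos + 4, cell.getInt(pos));
    }

    private static byte[] copy(ByteBuffer cell, int from, int length) {
        byte[] out = new byte[length];
        for (int i = 0; i < length; i++)
            out[i] = cell.get(from + i);
        return out;
    }
}
```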



[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965481#comment-13965481
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. Can we decide if we actually want to have Cell (and derivatives) as this 
patch set proposes (with static Impl classes, which is OOP unfriendly to say 
the least) or do something else (question raised back in CASSANDRA-6689)?

If we accept the NativeCell/BufferCell distinction above, then the combination 
of optimization and lack of multiple inheritance drives this design or 
something like it.  Specifically, we want NativeCell to be both a Cell and a 
NativeAllocation, so Benedict has (reasonably, IMO) chosen to extend NA and 
leave the Cell common methods in a utility Impl class.  (IMO the right OOP 
approach would be to extend Cell, making it an Abstract class instead of an 
Interface, and have NativeCell have a NA as a field instead of extending it.  
But then we're increasing the memory overhead of a NC by almost 50% which 
directly impacts our main goal here.)

I can see reasonable alternatives to where exactly the static utility methods 
live: put them in the BufferCell classes and have the Native classes reuse them 
that way, or put them in a separate class entirely, and I'm okay with either of 
those options but I don't really see them as strictly better than the Impl 
choice (which has the advantage of encapsulating what interface specifically 
they deal with, distinct from the Buffer or Native subclasses).
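
The design constraint described above can be sketched as follows: the native cell spends its single superclass slot on the allocation base class, so logic shared with the buffer-backed cell lives in a static utility class nested in the interface. All names (DemoCell, DemoNativeAllocation, DemoBufferCell, DemoNativeCell) and the toy dataSize rule are hypothetical and only mirror the shape of the Impl approach.

```java
interface DemoCell {
    long timestamp();
    int valueSize();
    int dataSize();

    // Shared logic as static helpers, because DemoNativeCell cannot also
    // inherit from an abstract cell class.
    final class Impl {
        private Impl() {}

        // Toy rule: 8-byte timestamp plus the value bytes.
        static int dataSize(DemoCell cell) {
            return 8 + cell.valueSize();
        }
    }
}

// Stand-in for NativeAllocation: holds the native address directly, so the
// cell needs no extra object (and no extra header) to reference it.
abstract class DemoNativeAllocation {
    final long peer;

    DemoNativeAllocation(long peer) {
        this.peer = peer;
    }
}

final class DemoBufferCell implements DemoCell {
    private final long timestamp;
    private final int valueSize;

    DemoBufferCell(long timestamp, int valueSize) {
        this.timestamp = timestamp;
        this.valueSize = valueSize;
    }

    public long timestamp() { return timestamp; }
    public int valueSize() { return valueSize; }
    public int dataSize() { return DemoCell.Impl.dataSize(this); }
}

final class DemoNativeCell extends DemoNativeAllocation implements DemoCell {
    private final long timestamp; // a real native cell would read these from memory at peer
    private final int valueSize;

    DemoNativeCell(long peer, long timestamp, int valueSize) {
        super(peer);
        this.timestamp = timestamp;
        this.valueSize = valueSize;
    }

    public long timestamp() { return timestamp; }
    public int valueSize() { return valueSize; }
    public int dataSize() { return DemoCell.Impl.dataSize(this); }
}
```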



[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965487#comment-13965487
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. Is it essential to move everything to the separate package .data ?

If I may bikeshed a bit, data is a fairly meaningless term in the Cassandra 
context and I would prefer to name it cells instead.  Otherwise, I think it's 
a reasonable refactor.

My initial reaction was, moving things to different packages should totally be 
a separate commit but the new interfaces don't share a whole lot with the old 
classes other than the name.  So even that doesn't really bother me, but if 
Pavel or Marcus still want that to facilitate review then it's a reasonable 
request.



[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965493#comment-13965493
 ] 

Benedict commented on CASSANDRA-6694:
-

I agree data is a bit meaningless - and, in fact, I started with cells. But 
it includes DecoratedKey / RowPosition, so data became the easiest most 
encompassing term. More than open to better suggestions.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965502#comment-13965502
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

Simple solution: leave DK and RP where they are. :)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-10 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965562#comment-13965562
 ] 

Benedict commented on CASSANDRA-6694:
-

Well, the only fly in that ointment is that they have Buffer and Native 
implementations also, and the DataAllocator allocates them as well as cells. 
So to separate them seems a bit strange - but I'm not too fussed tbh.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-09 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963907#comment-13963907
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


bq. I'm saying performance critical code is impacted when you have virtual 
method calls that cannot be optimised by the VM (i.e. those with multiple 
implementations). I meant CASSANDRA-6553 and CASSANDRA-6934

Which means that if we actually optimize AbstractType and its derivatives to 
work directly with the underlying bytes, the whole problem could be resolved? 
That's why I want to understand why we can't have a simple implementation of 
the cell which has one buffer + metadata about component sizes (which could 
also be encoded), instead of having a buffer per component in the name (if 
composite) + a buffer for the value + a long timestamp. Maybe it would be 
easier to offload all of the work to AbstractType instead of trying to 
optimize at the Cell level?

I went through the JVM instruction set doc (specifically 
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.invokespecial
 and 
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.invokestatic
 ). Those instructions are not that different, and both have to do a lookup in 
the constant_pool of the class, so I'm wondering whether it's virtual calls 
that create the problem or something else masked by that...

It also looks like the static Impl scheme (as in the patch set) would execute 
the same number of instructions, because the compiler emits *aload_0* (this) 
in both cases, whether before invoke\{special, virtual\} or invokestatic, and 
more instructions in the static Impl form if we pass something other than 
this. Generally, when callers use methods from a superclass or interface (as 
is the case right now for e.g. Cell.dataSize()), the compiler emits *aload_0, 
invokevirtual #offset* directly to the Cell method, whereas with static Impl 
it has to do that multiple times: *aload_0, invokevirtual #offset* (to 
DeletedCell.dataSize()) and then internally *aload_0, invokestatic #offset* 
(to DeletedCell.Impl.dataSize()), which means a longer constant_pool walk.
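
To make the two call shapes under discussion concrete, here is a minimal sketch. All names are simplified illustrations and are not copied from the patch: `PlainCell` overrides the method directly, while `NativeDeletedCell` and `BufferDeletedCell` forward to a shared static `Impl` method, roughly as the static Impl scheme does. Compiling it and running `javap -c` on the classes shows the bytecode each form produces.

```java
/** Hypothetical sketch of direct overrides versus forwarding to a static Impl. */
final class ImplSchemeSketch
{
    interface Cell
    {
        int valueSize();
        int cellDataSize();
    }

    /** Direct override: the size logic lives in the instance method itself. */
    static final class PlainCell implements Cell
    {
        private final int valueSize;
        PlainCell(int valueSize) { this.valueSize = valueSize; }
        public int valueSize() { return valueSize; }
        public int cellDataSize() { return valueSize + 8; } // value + 8-byte timestamp
    }

    /** Shared logic as a static method that takes the cell as an argument. */
    static final class Impl
    {
        static int cellDataSize(Cell cell) { return cell.valueSize() + 8; }
    }

    static final class NativeDeletedCell implements Cell
    {
        public int valueSize() { return 4; }
        // Compiles to: aload_0, invokestatic Impl.cellDataSize, ireturn
        public int cellDataSize() { return Impl.cellDataSize(this); }
    }

    static final class BufferDeletedCell implements Cell
    {
        public int valueSize() { return 4; }
        public int cellDataSize() { return Impl.cellDataSize(this); }
    }
}
```

Bytecode alone does not say what the JIT will inline; that depends on the receiver types HotSpot observes at each call site.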


bq. Then what exactly do we win? We still have to have two hierarchies and the 
same modularization. Also the potential ease of optimizations for comparison 
disappear, and we still have increased indirection and virtual method call 
costs. If this is the suggestion, I am very -1, as the payoff is very small, 
the work nontrivial and the negatives substantial.

The wins are, primarily, less object overhead (the ultimate goal of all this) 
and maintainability of the code. We basically have Cell based on type - 
expired, deleted, counter, client (the last one being used mostly by Thrift) - 
as it is right now, so no Buffered* or Native*, plus allocators of 3 types 
(maybe we actually don't need the one which allocates DirectBuffer but can 
just go with the JNA backed one) which allocate raw bytes. Cell reconcile, 
equals, dataSize and other methods become straightforward. Also, as we 
consider Composite as a complete entity, storing components as contiguous 
blocks would reduce container overhead and speed up comparisons by exploiting 
spatial locality.

[~jbellis] mentioned this: "My preferred solution would be, stop extracting the 
name so often by itself. Spot checking the code, it seems we usually do this 
just to simplify a comparison, so this could in principle just be done with 
the Cell object rather than just the name." I think that would further 
benefit the approach that I'm describing.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-09 Thread Benedict (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963923#comment-13963923
]

Benedict commented on CASSANDRA-6694:
-

bq. less object overhead

There is no reduced overhead from the current patch.

bq. Also, as we consider Composite as a complete entity, storing components as
contiguous blocks would reduce container overhead + speeds up comparisons by
exploiting spatial locality

You seem to be backtracking to the prior suggestion of only one implementation.
I am potentially ok with this, but see my prior comment for concerns and
complications. The -1 was to having what we have now except with an extra level
of indirection (i.e. one packed Cell implementation, and one componentised like
we had before this patch). Also, I would prefer to avoid the extra indirection
+virtual method costs of having another inner object representation, within
which we need another offset.

The JVM instruction set is besides the point. The point is what hotspot will
do: with a single implementor or static method of small enough bytecode
representation, it will be inlined. Note I said multiple implementation
virtual method. With the option you suggest we will need an extra virtual
invocation cost with every access to the underlying bytes, some extra math to
access the right location, and one extra object field reference to locate the
position we're offsetting from. These costs mount up rapidly.

Hmm. No, I now note your client implementation: what exactly is this one?
Please clarify, as the thrift cell is going to need to be compared with the
other implementations, and suddenly much of any benefit will disappear. The
best way to make comparisons cheap and easy is to have both sides of the
comparison have at least the same layout. If we have to either virtual invoke
or instanceof check for every comparison, and a different code path for
comparing each type of representation, there will be a performance impact. As
such the only main benefit of this approach is eliminated in my eyes. Also, how
will this client implementation achieve its various functions, and define its
type? Seems like you'll need a duplicate hierarchy still.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-09 Thread Pavel Yaskevich (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13964564#comment-13964564
]

Pavel Yaskevich commented on CASSANDRA-6694:

bq. The JVM instruction set is besides the point. The point is what hotspot
will do: with a single implementor or static method of small enough bytecode
representation, it will be inlined. Note I said multiple implementation
virtual method. With the option you suggest we will need an extra virtual
invocation cost with every access to the underlying bytes, some extra math to
access the right location, and one extra object field reference to locate the
position we're offsetting from. These costs mount up rapidly.

How is that besides the point when you claim that method calls with multiple
implementations are slower (and do not get inlined) compared to static method
invocations from multiple classes, which is basically a constant_pool
reimplementation in your code?... What I claim is that it doesn't matter
whether you override a method multiple times or call a static method which
calls another static method, like your patch does for DeletedCell, e.g.
\{Native, Buffer\}DeletedCell.cellDataSize(), which calls
DeletedCell.Impl.cellDataSize(this), which transfers to
Cell.Impl.cellDataSize(this). Just make an example, disassemble the classes
(with javap -c or similar) and see what bytecode was generated. Also, for the
inlining problem, I would like to see proof of why those methods are not
getting inlined (are they even touched by the JIT?), by enabling logging with
-XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+PrintInlining and
sharing the output; otherwise the claim that multiple-implementation virtual
methods are slow is just empty rhetoric.
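
A minimal harness for gathering that kind of evidence might look like the following. It is a sketch, not a benchmark: the `Sized` types are hypothetical stand-ins for the Cell implementations, and it only creates one hot call site that sees three receiver types. Running it with `java -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+PrintInlining InliningProbe` prints HotSpot's decisions for `sum`; what is printed varies with JVM version and settings.

```java
/** Hypothetical probe: one call site that observes three implementations. */
final class InliningProbe
{
    interface Sized { int size(); }
    static final class A implements Sized { public int size() { return 1; } }
    static final class B implements Sized { public int size() { return 2; } }
    static final class C implements Sized { public int size() { return 3; } }

    /** The call to size() below is the call site whose inlining is of interest. */
    static long sum(Sized[] cells, int rounds)
    {
        long total = 0;
        for (int r = 0; r < rounds; r++)
            for (Sized cell : cells)
                total += cell.size();
        return total;
    }

    public static void main(String[] args)
    {
        // Three receiver types at one call site; use a single type to compare.
        Sized[] mixed = { new A(), new B(), new C() };
        System.out.println(sum(mixed, 1_000_000));
    }
}
```

Filling the array with instances of a single class and rerunning gives the single-receiver case to compare against.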

bq. Hmm. No, I now note your client implementation: what exactly is this one?
Please clarify, as the thrift cell is going to need to be compared with the
other implementations, and suddenly much of any benefit will disappear. The
best way to make comparisons cheap and easy is to have both sides of the
comparison have at least the same layout. If we have to either virtual invoke
or instanceof check for every comparison, and a different code path for
comparing each type of representation, there will be a performance impact. As
such the only main benefit of this approach is eliminated in my eyes. Also, how
will this client implementation achieve its various functions, and define its
type? Seems like you'll need a duplicate hierarchy still.

That was just a suggestion for a temp container in between the client transport
and the memtable. As those buffers are already allocated separately by Thrift,
it seems reasonable to have Cell work with those buffers. It would take more
memory for the ByteBuffer containers passed from Thrift, but cell comparison
logic should not change, as they would operate on the common container type.
It's a similar concept to what Netty does with a ByteBuf gathered from other
ByteBuf pieces.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962658#comment-13962658
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. Can we decide if we actually want to have Cell (and derivatives) as this 
patch set proposes (with static Impl static classes which is OOP unfriendly to 
say the least) or do something else (question raised back in CASSANDRA-6689)?

Can we have something more concrete than "something else" as a suggestion?

bq. Is it essential to move everything to the separate package .data ?

No refactoring is essential - however it is much cleaner given all of the new 
classes.

bq. Maybe there is a way which allows us to still have key/value/timestamp as 
fields, so we should only change callers method/class signatures instead? In 
general the idea would be to keep a single implementation of the Cell and add a 
generic placeholder instead of ByteBuffer.

This seems to miss the entire purpose of this patch, which is to reduce the 
heap consumption of each Cell. If we use another placeholder, we will no doubt 
only *increase* the memory consumption, not decrease it; or, at best, reduce it 
only fractionally for off-heap and increase it for on-heap implementations. 
Neither is really acceptable, and either would make this whole patch a bit 
worthless.



[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962679#comment-13962679
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


If I had something more concrete you would see a patch for it, but here I am 
trying to start a discussion. I think [~jbellis] mentioned that it might be 
better to reduce usage of the column names instead of merging cell with column 
name (if I remember correctly). Regarding moving stuff around - if it's not 
essential then we can do it at the very last stage, once we are done with all 
the more important changes, which are plenty. Regarding the placeholders idea: 
if we allocate a contiguous region for the whole cell, we can just have a 
memory object + 1 int (or was it even short?...) field which marks the end of 
the column name in that buffer; as the column timestamp is a fixed-size long, 
we know exactly where the column value ends. That also helps with spatial 
locality in most cases.
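
A minimal sketch of such a layout, assuming a `ByteBuffer` stands in for the "memory object" and using hypothetical names throughout. The field order chosen here (name, then the 8-byte timestamp, then the value) is only one possible arrangement; the point is that a single int of metadata is enough to locate every field.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: a whole cell packed into one contiguous buffer. */
final class PackedCell
{
    private final ByteBuffer memory; // layout: [name][8-byte timestamp][value]
    private final int nameEnd;       // offset at which the name ends

    PackedCell(byte[] name, long timestamp, byte[] value)
    {
        memory = ByteBuffer.allocate(name.length + 8 + value.length);
        memory.put(name).putLong(timestamp).put(value);
        memory.flip();
        nameEnd = name.length;
    }

    /** Absolute read at a computed offset. */
    long timestamp() { return memory.getLong(nameEnd); }

    int valueSize() { return memory.limit() - nameEnd - 8; }

    /** View over the name bytes; duplicate() copies no data but creates a small wrapper. */
    ByteBuffer name()
    {
        ByteBuffer view = memory.duplicate();
        view.limit(nameEnd);
        return view;
    }

    /** View over the value bytes, which run from after the timestamp to the end. */
    ByteBuffer value()
    {
        ByteBuffer view = memory.duplicate();
        view.position(nameEnd + 8);
        return view;
    }
}
```

Note that the `name()` and `value()` views still allocate a wrapper object per call, which is the kind of cost the comparison code would need to avoid.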


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962691#comment-13962691
 ] 

Benedict commented on CASSANDRA-6694:
-

bq. I think Jonathan Ellis mentioned that it might be better to reduce usage of 
the column names instead of merging cell with column name (if I remember 
correctly)

I don't recall this suggestion. Perhaps you are referring to the suggestion 
that we not extract the cell names from the cell as often as we do, for the 
purpose of comparison, in order to reduce garbage production?

bq. Regarding placeholders idea, if we allocate contiguous region for the whole 
cell we can just have memory object + 1 int (or was it even short?...) field 
which marks the end of the column name at that buffer, as column timestamp is a 
fixed size long we know exactly where column value ends, that also helps with 
spatial locality in most of the 

In this case, this suggestion has much more complex problems:

# More (multiple implementation) virtual method invocations (as shown by 
CASSANDRA-6993 this can have meaningfully negative performance implications)
# Major refactor of AbstractType hierarchy to prevent bytebuffer allocation on 
comparison
# More object allocation in the request threads due to having to re-pack all of 
any parameters into a Cell with a single buffer, as opposed to just dropping 
them in place
# At which point it would make most sense to refactor (and mostly eliminate) 
the entirety of CASSANDRA-5417, as we're almost always pumping the result 
straight into a Cell anyway, so extracting the components into separate buffers 
and repacking them into a single buffer in the Cell is very wasteful
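
For the second item in the list above, the kind of change involved can be sketched as follows. This is a hypothetical helper, not Cassandra's AbstractType API: it compares two byte ranges in place with absolute `get` calls, treating bytes as unsigned, so no ByteBuffer slice or duplicate is created per comparison.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: comparing byte ranges without allocating views. */
final class InPlaceCompare
{
    /** Lexicographic unsigned comparison of a[aOff, aOff+aLen) with b[bOff, bOff+bLen). */
    static int compare(ByteBuffer a, int aOff, int aLen, ByteBuffer b, int bOff, int bLen)
    {
        int common = Math.min(aLen, bLen);
        for (int i = 0; i < common; i++)
        {
            int x = a.get(aOff + i) & 0xFF; // absolute get: position is untouched
            int y = b.get(bOff + i) & 0xFF;
            if (x != y)
                return x < y ? -1 : 1;
        }
        // All shared bytes are equal, so the shorter range sorts first.
        return Integer.compare(aLen, bLen);
    }
}
```

A real comparator would also have to honour each type's own ordering rules, which is where the refactoring effort mentioned above would go.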

That said, it is *viable*. It has some advantages too: the comparisons between 
Native and Buffer cells are much more easily optimised. Many of these changes 
may well need to happen in the natural course of things anyway as we optimise 
the native implementation. But it has comparatively wide-ranging implications 
for the current on-heap use case that might be a bit too much to bite off right 
now.

bq. if it's not essential then we can do it at the very last stage once we done 
with all more important changes which are plenty

I disagree. It makes the patch more complicated to *not* move it around. 
Just because something is not essential does not mean it is not the better 
option.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963063#comment-13963063
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

bq. It makes the patch more complicated to not move it around.

I thought Pavel was referring to moving existing classes into new packages, but 
I may be mistaken.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963100#comment-13963100
 ] 

Benedict commented on CASSANDRA-6694:
-

Yes, but I've gutted those classes into interfaces and introduced new sister 
classes. And modelling those (in my head, at least) is very difficult when it's 
not easy to see what classes relate to each other at a glance, and the db 
package is overpopulated as it is. So anything I need to do to make it possible 
to think about whilst writing it, I assume is going to be helpful for anybody 
else reading it.

But if you both want to change that bit, I can rebase again. Since I've written 
it, it doesn't matter so much to me now.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-08 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963531#comment-13963531
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


bq. More (multiple implementation) virtual method invocations (as shown by 
CASSANDRA-6993 this can have meaningfully negative performance implications)

I'm getting mixed signals here, are you claiming that JVM does a bad job or OOP 
is broken in general? Also CASSANDRA-6993 seems to point to a different problem.

bq. Major refactor of AbstractType hierarchy to prevent bytebuffer allocation 
on comparison

I don't see a problem with this if it spares most of the changes in allocators 
and Cell*/DecoratedKey rewrites.

bq. More object allocation in the request threads due to having to re-pack all 
of any parameters into a Cell with a single buffer, as opposed to just dropping 
them in place

We can have a separate Cell implementation with multiple buffers, as Thrift 
allocates them anyway; these are going to be transformed to linear ones once 
they get into the memtable, as we have to reallocate there.

bq. At which point it would make most sense to refactor (and mostly eliminate) 
the entirety of CASSANDRA-5417, as we're almost always pumping the result 
straight into a Cell anyway, so extracting the components into separate buffers 
and repacking them into a single buffer in the Cell is very wasteful

Is it good or bad to refactor it? [~slebresne]/[~jbellis] WDYT?



 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per cell overhead), and 4-byte object reference for maintaining our internal 
 list of allocations, which is unfortunately necessary since we cannot safely 
 (and cheaply) walk the object graph we allocate otherwise, which is necessary 
 for (allocation-) compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.
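
The arithmetic in the description above, and the "alignment trick" for 4-byte addresses, can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the 8-byte alignment is an assumption chosen by analogy with the VM's compressed references, not a detail confirmed by the ticket.

```java
// Hypothetical sketch; none of these names come from the Cassandra patch.
final class CellOverheadSketch {
    static final int OBJECT_HEADER_BYTES = 8;       // "8-byte object overhead"
    static final int COMPRESSED_ADDRESS_BYTES = 4;  // "4-byte address"
    static final int ALLOCATION_REF_BYTES = 4;      // "4-byte object reference"
    static final int ALIGNMENT_SHIFT = 3;           // assumption: 8-byte aligned allocations

    // 8 + 4 + 4 = 16 bytes, before the 4-6 bytes of btree overhead.
    static int perCellOverhead() {
        return OBJECT_HEADER_BYTES + COMPRESSED_ADDRESS_BYTES + ALLOCATION_REF_BYTES;
    }

    // Store an aligned offset in 32 bits by dropping the always-zero low bits.
    static int compress(long offset) {
        if ((offset & ((1L << ALIGNMENT_SHIFT) - 1)) != 0)
            throw new IllegalArgumentException("offset is not aligned");
        long units = offset >>> ALIGNMENT_SHIFT;
        if (units > 0xFFFFFFFFL)
            throw new IllegalArgumentException("offset out of addressable range");
        return (int) units;
    }

    // Recover the full offset, treating the stored int as unsigned.
    static long decompress(int compressed) {
        return (compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }

    // With 32 bits and 8-byte alignment this is 2^35 bytes (32 GiB).
    static long addressableBytes() {
        return (1L << 32) << ALIGNMENT_SHIFT;
    }
}
```

Under these assumptions the addressable space is bounded, which is why the description expects the trick not to last forever.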



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963619#comment-13963619
]

Benedict commented on CASSANDRA-6694:
-

bq. I'm getting mixed signals here: are you claiming that the JVM does a bad
job, or that OOP is broken in general? Also, CASSANDRA-6993 seems to point to a
different problem.

I'm saying performance-critical code is impacted when you have virtual method
calls that cannot be optimised by the VM (i.e. those with multiple
implementations). The tickets I meant were CASSANDRA-6553 and CASSANDRA-6934.
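
A minimal sketch of the kind of call site in question. The types are hypothetical stand-ins, not classes from the patch, and the comments describe typical JIT behaviour as a general expectation rather than a measured result.

```java
// Hypothetical types for illustration only; they are not classes from the patch.
interface NameSketch {
    int length();
}

final class HeapNameSketch implements NameSketch {
    private final byte[] bytes;
    HeapNameSketch(byte[] bytes) { this.bytes = bytes; }
    public int length() { return bytes.length; }
}

final class NativeNameSketch implements NameSketch {
    private final int length; // stand-in for a length read from native memory
    NativeNameSketch(int length) { this.length = length; }
    public int length() { return length; }
}

final class CallSiteSketch {
    static long totalLength(NameSketch[] names) {
        long total = 0;
        for (NameSketch name : names) {
            // If only one implementation ever reaches this call, a JIT can
            // usually inline it. Once several implementations are observed,
            // it may have to fall back to a dispatched (virtual) call.
            total += name.length();
        }
        return total;
    }
}
```

Each additional implementation of a hot interface can therefore change how well existing call sites are optimised, even though the result is the same.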

bq. We can have a separate Cell implementation with multiple buffers, as Thrift
allocates them anyway; these would be transformed into linear ones once
they get into the memtable, since we have to reallocate there.

Then what exactly do we win? We still have to have two hierarchies and the same
modularisation. Also, the potential ease of optimisations for comparison
disappears, and we still have increased indirection and virtual method call
costs. If this is the suggestion, I am very -1, as the payoff is very small,
the work nontrivial and the negatives substantial.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-07 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962580#comment-13962580
 ] 

Pavel Yaskevich commented on CASSANDRA-6694:


I looked through the code and here are a couple of questions that I have:

# Can we decide whether we actually want to have Cell (and derivatives) as this 
patch set proposes (with static Impl classes, which is OOP-unfriendly to 
say the least), or do something else (a question raised back in CASSANDRA-6689)?
# Is it essential to move everything to the separate package .data?
# Do we want to have a number of pool/allocator implementations which allocate 
different types of objects, or is it possible to make a generic container (for 
ByteBuffer/Memory) which would basically be a pointer to a bigger buffer that 
holds all of the components (name/value/timestamp), so that we have a limited 
number of allocators/pools to maintain ([~jbellis] described the same vision in 
one of his previous comments)? Maybe there is a way which allows us to still 
have key/value/timestamp as fields, so that we only change callers' 
method/class signatures instead? In general, the idea would be to keep a single 
implementation of the Cell and add a generic placeholder instead of ByteBuffer.
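
To make the third question concrete, here is a minimal sketch of a "pointer into a bigger buffer" container. Everything in it (the class name, the record layout, using an int offset as the pointer) is an assumption for illustration; it is not thread-safe and is not the design in the patch.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a single slab holding many cells; not from the patch.
final class SlabSketch {
    // Assumed record layout: [timestamp:8][nameLength:4][valueLength:4][name][value]
    private static final int HEADER_BYTES = 16;

    private final ByteBuffer slab;
    private int next = 0; // bump pointer; a real allocator would need synchronisation

    SlabSketch(int capacity) {
        slab = ByteBuffer.allocateDirect(capacity); // off-heap backing store
    }

    // Copies all components into the slab; the returned offset acts as the "pointer".
    int allocate(byte[] name, byte[] value, long timestamp) {
        int size = HEADER_BYTES + name.length + value.length;
        if (slab.capacity() - next < size)
            throw new IllegalStateException("slab full");
        int offset = next;
        ByteBuffer writer = slab.duplicate();
        writer.position(offset);
        writer.putLong(timestamp).putInt(name.length).putInt(value.length).put(name).put(value);
        next += size;
        return offset;
    }

    long timestamp(int offset) {
        return slab.getLong(offset);
    }

    byte[] name(int offset) {
        return read(offset + HEADER_BYTES, slab.getInt(offset + 8));
    }

    byte[] value(int offset) {
        int nameLength = slab.getInt(offset + 8);
        return read(offset + HEADER_BYTES + nameLength, slab.getInt(offset + 12));
    }

    private byte[] read(int from, int length) {
        byte[] out = new byte[length];
        ByteBuffer reader = slab.duplicate();
        reader.position(from);
        reader.get(out);
        return out;
    }
}
```

With this shape only one allocator is needed, because a cell is just an offset into the slab, and name/value/timestamp are derived on access instead of being stored as separate buffers.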


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13959841#comment-13959841
 ] 

Benedict commented on CASSANDRA-6694:
-

rebased and pushed -f


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-04 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960112#comment-13960112
 ] 

Jonathan Ellis commented on CASSANDRA-6694:
---

Is there a case to be made here that there's more abstraction than necessary?  
Because I'm still having trouble wrapping my head around it.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960117#comment-13960117
 ] 

Benedict commented on CASSANDRA-6694:
-

Well, it's probably indicative of something wrong, but I don't think it's the 
level of abstraction. Probably I can re-organise it to make it clearer, though.


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960187#comment-13960187
 ] 

Benedict commented on CASSANDRA-6694:
-

Rebased, reorganised and pushed to 
[6694-reorg|https://github.com/belliottsmith/cassandra/tree/6694-reorg]

Does that make it clearer what's going on?


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables