[jira] [Comment Edited] (CASSANDRA-5521) move IndexSummary off heap

2013-05-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647664#comment-13647664
 ] 

Jonathan Ellis edited comment on CASSANDRA-5521 at 5/2/13 4:28 PM:
---

bq. Changing the Partitioner is a bigger change

It does get ugly since you'd need to reimplement Murmur3.hash3_x64_128 on 
Memory objects.  (Not for the first time, I'm pissed that ByteBuffer isn't an 
interface...)

Let's go ahead and move forward with v2 and optimize later if we need to.

Nits:
# rename hasSummaries to offHeapSummaries
# InputStream.read has an overload that takes a length parameter, so you don't 
need to realloc the buffer
# The comment doesn't match the code here.  Also, getIndex should be private.
{code}
    // multiply by 4 and add the block start
    return bytes.getInt(index << 2);
{code}
# We can easily inline DK.compareTo instead of actually creating a DK object 
(i.e., call partitioner.getToken instead, then compare the tokens and keys 
without the DK wrapper)
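
For the InputStream.read nit, a minimal sketch of reusing one buffer across reads. The class and method names are hypothetical and this is not the code from the patch; it only illustrates the standard read(byte[], int, int) overload:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the patch under review.
final class ReadFullyExample
{
    // Fills buf[0..length) from the stream, reusing the caller's buffer
    // instead of allocating a new array sized to each read.
    static void readFully(InputStream in, byte[] buf, int length) throws IOException
    {
        if (length > buf.length)
            throw new IllegalArgumentException("buffer too small for " + length + " bytes");

        int read = 0;
        while (read < length)
        {
            // read(byte[], int, int) may return fewer bytes than requested, so loop
            int n = in.read(buf, read, length - read);
            if (n < 0)
                throw new EOFException("stream ended after " + read + " bytes");
            read += n;
        }
    }
}
```

The buffer only has to be as large as the largest entry; bytes past `length` are leftovers from earlier reads and must be ignored by the caller.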

The rest LGTM.


  was (Author: jbellis):
bq. Changing the Partitioner is a bigger change

It does get ugly since you'd need to reimplement Murmur3.hash3_x64_128 on 
Memory objects.  (Not for the first time, I'm pissed that ByteBuffer isn't an 
interface...)

Let's go ahead and move forward with v2 and optimize later if we need to.

Nits:
# rename hasSummaries to offHeapSummaries
# InputStream.read has an overload that takes a length parameter, so you don't 
need to realloc the buffer
# The comment doesn't match the code here.  Also, getIndex should be private.
{code}
    // multiply by 4 and add the block start
    return bytes.getInt(index << 2);
{code}

The rest LGTM.

  
 move IndexSummary off heap
 --

 Key: CASSANDRA-5521
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5521
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Assignee: Vijay
 Fix For: 2.0


 IndexSummary can still use a lot of heap for narrow-row sstables.  (It can 
 also contribute to memory fragmentation because of the large arrays it 
 creates.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (CASSANDRA-5521) move IndexSummary off heap

2013-05-01 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647264#comment-13647264
 ] 

Vijay edited comment on CASSANDRA-5521 at 5/2/13 4:09 AM:
--

Honestly, glad to see the thread following the same thought process that I 
went through.

Changing the Partitioner is a bigger change... but before we go there, 
wondering if this optimization is going to help us? 
Creating the BB (ByteBuffer) is not cheap, but it will be good garbage that 
lives and dies in the young generation.

I can think of 2 other options...
1) We can serialize and deserialize the Token in IndexSummary. We would still 
need an additional function to serialize to and deserialize from memory (for 
BOP we can serialize the key/byte[]). This also removes the token calculation 
overhead, and we can try to compare incrementally.
2) We can use an MMappedFile instead and get a ByteBuffer (this could work in 
our favor: for new SSTs that are never queried there is zero overhead in 
memory) :)
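
To make option 2 concrete, here is a rough sketch of getting a ByteBuffer from a memory-mapped file with plain java.nio. The class name and the idea of a standalone summary file are assumptions for illustration only, not the on-disk layout:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; assumes the summary lives in its own file.
final class MappedSummaryExample
{
    // Maps the whole file read-only. Pages are only faulted in when touched,
    // so a file that is never queried costs address space, not resident memory.
    static MappedByteBuffer mapReadOnly(Path file) throws IOException
    {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ))
        {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }
}
```

The returned buffer can be passed anywhere a ByteBuffer is expected, which is what makes this option attractive compared with a custom Memory type. Note that a single mapping is limited to Integer.MAX_VALUE bytes.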

  was (Author: vijay2...@yahoo.com):
Honestly, glad to see the thread following the same thought process that I 
went through.

Changing the Partitioner is a bigger change... before we go there, wondering if 
this optimization is going to help us? 
Creating the BB (ByteBuffer) is not cheap, but it will be good garbage that 
lives and dies in the young generation.

I can think of 2 other options...
1) We can serialize and deserialize the Token in IndexSummary for RP (for BOP 
we can serialize the key/byte[]) so we can compare incrementally too (taking 
the hit during flush)
2) We can use an MMappedFile instead and get a ByteBuffer (this could work in 
our favor: for new SSTs that are never queried there is zero overhead in 
memory) :)
  



[jira] [Comment Edited] (CASSANDRA-5521) move IndexSummary off heap

2013-04-30 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646352#comment-13646352
 ] 

Pavel Yaskevich edited comment on CASSANDRA-5521 at 5/1/13 5:55 AM:


Have we considered using vint encoding on those arrays to minimize space 
consumption, since we keep them in memory anyway?

Edit: I remember now why that is not a good idea :) I wonder, though, what the 
memory footprint could be if we use a TreeMap inside, with keys and offsets (in 
vint encoding) saved in native memory...
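
For reference, a minimal sketch of one common variable-length integer scheme (7 payload bits per byte, high bit as continuation flag). This is an assumed encoding for illustration and not necessarily the vint format Cassandra uses; the class and method names are hypothetical:

```java
import java.nio.ByteBuffer;

// Hypothetical example of an unsigned varint codec, for illustration only.
final class VIntExample
{
    // Writes a non-negative value 7 bits at a time, low bits first.
    // The high bit of each byte signals that more bytes follow.
    static void writeUnsignedVInt(ByteBuffer out, long value)
    {
        if (value < 0)
            throw new IllegalArgumentException("value must be non-negative");
        while ((value & ~0x7FL) != 0)
        {
            out.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.put((byte) value);
    }

    // Reads a value written by writeUnsignedVInt, advancing the buffer position.
    static long readUnsignedVInt(ByteBuffer in)
    {
        long result = 0;
        int shift = 0;
        while (true)
        {
            byte b = in.get();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
            shift += 7;
        }
    }
}
```

Small offsets take one or two bytes instead of a fixed four or eight. The trade-off is that entries become variable-width, so entry i can no longer be located at a fixed offset without a separate index or a scan.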

  was (Author: xedin):
Have we considered using vint encoding on those arrays to minimize space 
consumption, since we keep them in memory anyway?
  
