[jira] [Comment Edited] (CASSANDRA-5521) move IndexSummary off heap
[ https://issues.apache.org/jira/browse/CASSANDRA-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647664#comment-13647664 ]

Jonathan Ellis edited comment on CASSANDRA-5521 at 5/2/13 4:28 PM:
---

bq. Changing the Partitioner is a bigger change

It does get ugly since you'd need to reimplement Murmur3.hash3_x64_128 on Memory objects. (Not for the first time, I'm pissed that ByteBuffer isn't an interface...) Let's go ahead and move forward with v2 and optimize later if we need to.

Nits:
# rename hasSummaries to offHeapSummaries
# InputStream.read has an overload that takes a length parameter, you don't need to realloc the buffer
# The comment doesn't match the code here. Also, getIndex should be private.
{code}
// multiply by 4 and add the block start
return bytes.getInt(index << 2);
{code}
# We can easily inline DK.compareTo instead of actually creating a DK object (i.e., call partitioner.getToken instead, then compare the tokens and keys without the DK wrapper)

The rest LGTM.

was (Author: jbellis):

bq. Changing the Partitioner is a bigger change

It does get ugly since you'd need to reimplement Murmur3.hash3_x64_128 on Memory objects. (Not for the first time, I'm pissed that ByteBuffer isn't an interface...) Let's go ahead and move forward with v2 and optimize later if we need to.

Nits:
# rename hasSummaries to offHeapSummaries
# InputStream.read has an overload that takes a length parameter, you don't need to realloc the buffer
# The comment doesn't match the code here. Also, getIndex should be private.
{code}
// multiply by 4 and add the block start
return bytes.getInt(index << 2);
{code}

The rest LGTM.

move IndexSummary off heap
--------------------------

Key: CASSANDRA-5521
URL: https://issues.apache.org/jira/browse/CASSANDRA-5521
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Jonathan Ellis
Assignee: Vijay
Fix For: 2.0

IndexSummary can still use a lot of heap for narrow-row sstables. (It can also contribute to memory fragmentation because of the large arrays it creates.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
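Nit 2 in Jonathan Ellis's comment above refers to the standard `InputStream.read(byte[], int, int)` overload. The following is only a minimal sketch of that idea, not the code from the patch: the class name `BoundedReadSketch`, the method `copyExactly`, and the 4096-byte scratch size are hypothetical, and a direct `ByteBuffer` stands in for Cassandra's `Memory` class. It shows how one scratch array can be reused for every read, with the length argument capping the final, shorter read.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical illustration; none of these names come from the Cassandra patch.
final class BoundedReadSketch
{
    // Copies exactly `length` bytes from `in` into `dest`, reusing a single
    // scratch array instead of allocating a new one for the last partial chunk.
    static void copyExactly(InputStream in, ByteBuffer dest, long length) throws IOException
    {
        byte[] scratch = new byte[4096]; // assumed size, chosen arbitrarily
        long remaining = length;
        while (remaining > 0)
        {
            // The length argument limits how much this call may read,
            // so the same array works for full and partial chunks alike.
            int toRead = (int) Math.min(scratch.length, remaining);
            int read = in.read(scratch, 0, toRead);
            if (read < 0)
                throw new EOFException("stream ended with " + remaining + " bytes still expected");
            dest.put(scratch, 0, read);
            remaining -= read;
        }
    }
}
```

Because `read` may return fewer bytes than requested, the loop advances by the returned count rather than by `toRead`.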
[jira] [Comment Edited] (CASSANDRA-5521) move IndexSummary off heap
[ https://issues.apache.org/jira/browse/CASSANDRA-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647264#comment-13647264 ]

Vijay edited comment on CASSANDRA-5521 at 5/2/13 4:09 AM:
--

Honestly, glad to see the thread following the same thinking process that I went through.

Changing the Partitioner is a bigger change... but before we go there, I am wondering whether this optimization is going to help us. BB is not cheap, but it is going to be good garbage that will live and die in the young generation. I can think of 2 other options...

1) We can serialize and deserialize the Token in IndexSummary. We still need an additional function to serialize and deserialize from memory (for BOP we can serialize the key/byte[]; we have also removed the token calculation overhead), so we can also try to compare incrementally.
2) We can use MMappedFile instead and get a ByteBuffer (this could work in our favor: for new SSTs that are never queried there is zero overhead in memory) :)

was (Author: vijay2...@yahoo.com):

Honestly, glad to see the thread following the same thinking process that I went through.

Changing the Partitioner is a bigger change... before we go there, I am wondering whether this optimization is going to help us. BB is not cheap, but it is going to be good garbage that will live and die in the young generation. I can think of 2 other options...

1) We can serialize and deserialize the Token in IndexSummary for RP (for BOP we can serialize the key/byte[]) so we can compare incrementally too (taking the hit during flush)
2) We can use MMappedFile instead and get a ByteBuffer (this could work in our favor: for new SSTs that are never queried there is zero overhead in memory) :)
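Option 2 in Vijay's comment above, getting a ByteBuffer from a memory-mapped file, can be sketched with the standard `FileChannel.map` call. This is a minimal illustration under stated assumptions, not what the patch does: the class `MappedSummarySketch` and the method `mapReadOnly` are hypothetical names, and the file is assumed to fit in a single mapping (under 2 GB). Mapped pages are typically brought into memory by the operating system only when they are touched, which is the property the comment is relying on.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; none of these names come from the Cassandra patch.
final class MappedSummarySketch
{
    // Maps the whole file read-only and returns it as a ByteBuffer view.
    // The mapping stays valid after the channel is closed.
    static MappedByteBuffer mapReadOnly(Path file) throws IOException
    {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ))
        {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }
}
```

Absolute reads such as `mapped.getInt(offset)` then work the same way as on a heap or direct buffer.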
[jira] [Comment Edited] (CASSANDRA-5521) move IndexSummary off heap
[ https://issues.apache.org/jira/browse/CASSANDRA-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646352#comment-13646352 ]

Pavel Yaskevich edited comment on CASSANDRA-5521 at 5/1/13 5:55 AM:

Have we considered using vint encoding on those arrays, as we keep them in memory anyway, to minimize space consumption?

Edit: I remember now why that is not a good idea :) I wonder, though, what the memory footprint could be if we used a TreeMap inside, with keys and offsets (in vint encoding) saved in native memory...

was (Author: xedin):

Have we considered using vint encoding on those arrays, as we keep them in memory anyway, to minimize space consumption?
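For readers unfamiliar with the vint idea raised in Pavel Yaskevich's comment above, here is a minimal sketch of a generic base-128 variable-length integer encoding written into a direct (native-memory) `ByteBuffer`. It is an assumption-laden illustration: the class `VarIntSketch` and its methods are hypothetical, and the byte layout shown is the common 7-bits-per-byte scheme, not necessarily the format Cassandra uses. Small values take one byte; larger values take more.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a generic base-128 varint, not Cassandra's own format.
final class VarIntSketch
{
    // Writes `value` as 7-bit groups, low bits first; the high bit of each
    // byte is set when more bytes follow.
    static void writeUnsignedVarLong(long value, ByteBuffer out)
    {
        while ((value & ~0x7FL) != 0)
        {
            out.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.put((byte) value);
    }

    // Reads back a value written by writeUnsignedVarLong.
    static long readUnsignedVarLong(ByteBuffer in)
    {
        long result = 0;
        int shift = 0;
        byte b;
        do
        {
            b = in.get();
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        while ((b & 0x80) != 0);
        return result;
    }
}
```

A general property of such encodings is that entries have variable width, so the i-th entry can no longer be located by multiplying the index by a fixed size; it has to be found by scanning or through a separate index.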