[ 
https://issues.apache.org/jira/browse/CASSANDRA-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192654#comment-13192654
 ] 

Pavel Yaskevich commented on CASSANDRA-3762:
--------------------------------------------

I mention this because the problem in the original ticket was that rolling 
restarts took too long because of index summary computation (reading through 
the whole PrimaryIndex for every SSTable out there). Now imagine a situation 
where you have a few hundred SSTables, each with cached keys scattered across 
different parts of its primary index. If you go with getPosition() calls, you 
will get a lot of random I/O on each of those SSTables (you will have to seek 
deeper and deeper into the primary index file, which means slower data access 
even in mmap mode). I'm not sure that is really better than reading the primary 
index sequentially, especially since you have already read all of the 
index/data positions from the Summary component. I propose you run the test 
with many SSTables and compare system load times (don't forget to drop the page 
cache between tests by running `sync; echo 3 > /proc/sys/vm/drop_caches` as 
root).
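
To make the trade-off above concrete, here is a rough Python sketch of the two 
loading strategies. It is a toy model only: the fixed-width record layout, the 
SAMPLE_EVERY interval, and the function names are all assumptions made for 
illustration and do not reflect Cassandra's on-disk format or its 
getPosition() implementation. It only counts seeks and bytes read; it does not 
measure real disk latency.

```python
import bisect
import io
import struct

# Toy fixed-width index record: (key, data_position). Hypothetical layout.
ENTRY = struct.Struct(">QQ")
# Hypothetical summary sampling interval (one summary entry per N index entries).
SAMPLE_EVERY = 128


def build_index(num_keys):
    """Build a toy primary index for keys 0..num_keys-1 plus a sampled summary."""
    buf = io.BytesIO()
    summary = []  # (key, byte offset of that entry inside the index)
    for key in range(num_keys):
        if key % SAMPLE_EVERY == 0:
            summary.append((key, buf.tell()))
        buf.write(ENTRY.pack(key, key * 1000))
    return buf.getvalue(), summary


def load_sequential(index_bytes, cached_keys):
    """Read the whole index in one pass, keeping only the cached keys."""
    wanted = set(cached_keys)
    found = {}
    for key, pos in ENTRY.iter_unpack(index_bytes):
        if key in wanted:
            found[key] = pos
    return found, {"seeks": 1, "bytes_read": len(index_bytes)}


def load_by_lookup(index_bytes, summary, cached_keys):
    """Do one summary-guided seek per cached key, then a short forward scan."""
    f = io.BytesIO(index_bytes)
    sample_keys = [k for k, _ in summary]
    found = {}
    seeks = 0
    bytes_read = 0
    for wanted in cached_keys:
        i = bisect.bisect_right(sample_keys, wanted) - 1
        if i < 0:
            continue  # key sorts before the first index entry
        f.seek(summary[i][1])
        seeks += 1
        for _ in range(SAMPLE_EVERY):
            raw = f.read(ENTRY.size)
            if len(raw) < ENTRY.size:
                break  # end of index
            bytes_read += len(raw)
            key, pos = ENTRY.unpack(raw)
            if key == wanted:
                found[key] = pos
            if key >= wanted:
                break  # found it, or passed where it would have been
    return found, {"seeks": seeks, "bytes_read": bytes_read}
```

Both functions return the same key-to-position mapping. The lookup variant 
reads far fewer bytes but pays one seek per cached key per SSTable, which is 
the cost that multiplies across a few hundred SSTables on a cold page cache.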

By the way, I forgot to ask: did you drop the page cache before running the 
second test? If you didn't, that would pretty much explain such a dramatic 
improvement in the load time...
                
> AutoSaving KeyCache and System load time improvements.
> ------------------------------------------------------
>
>                 Key: CASSANDRA-3762
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3762
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: 0001-SavedKeyCache-load-time-improvements.patch
>
>
> CASSANDRA-2392 saves the index summary to disk, but when we have a saved 
> cache we still scan through the index to get the data out.
> We might be able to separate this from SSTR.load and let it load the index 
> summary; once all the SSTables are loaded, we might be able to check the 
> bloom filter and do random I/O on fewer indexes to populate the KeyCache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
