[ 
https://issues.apache.org/jira/browse/CASSANDRA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-4023:
--------------------------------------

    Attachment: cassandra-1.0-4023-v2.txt

I measured how long it takes to load each sstable component (Data, Index, 
Filter) and found that loading from the index file takes longer in 1.0 than in 
0.8.
Comparing the code in 0.8 and 1.0, I noticed that in 1.0 every key stored in 
the index file gets deserialized, while in 0.8 only those keys that should be 
added to the index summary get deserialized.
The reason we deserialize all keys in 1.0 is to obtain the first and last keys 
stored in the sstable. The attached patch tries to skip deserializing keys when 
possible.
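
As a rough illustration of the idea (a standalone sketch, not the attached 
patch): the code below scans a toy index and materializes a key only when it 
is the first entry, the last entry, or a sampled entry, and skips the key 
bytes otherwise. The class name IndexScanSketch, the entry layout (unsigned 
short key length, key bytes, long position), and the sampling interval of 128 
are all assumptions made for the example, not Cassandra's actual code or 
on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Cassandra code. Assumed entry layout:
// [unsigned short key length][key bytes][long data-file position].
class IndexScanSketch {
    static final int SAMPLE_INTERVAL = 128; // assumed summary sampling interval

    /** What a single pass over the index collects. */
    static final class Summary {
        String first;
        String last;
        final List<String> sampled = new ArrayList<>();
        int keysDeserialized;
    }

    /** Builds an in-memory toy index with keys key000000, key000001, ... */
    static byte[] buildIndex(int entries) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (int i = 0; i < entries; i++) {
                byte[] key = String.format("key%06d", i).getBytes(StandardCharsets.UTF_8);
                out.writeShort(key.length);
                out.write(key);
                out.writeLong(i * 100L);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Scans the index, deserializing only first, last, and sampled keys. */
    static Summary scan(byte[] index, int interval) {
        Summary s = new Summary();
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(index));
            long pos = 0;
            int entry = 0;
            while (pos < index.length) {
                int len = in.readUnsignedShort();
                long next = pos + 2 + len + 8; // offset of the following entry
                boolean isFirst = entry == 0;
                boolean isLast = next == index.length;
                boolean isSampled = entry % interval == 0;
                if (isFirst || isLast || isSampled) {
                    byte[] raw = new byte[len];
                    in.readFully(raw);
                    String key = new String(raw, StandardCharsets.UTF_8);
                    s.keysDeserialized++;
                    if (isFirst) s.first = key;
                    if (isLast) s.last = key;
                    if (isSampled) s.sampled.add(key);
                } else {
                    skipFully(in, len); // no key object is created
                }
                in.readLong(); // data-file position, unused in this sketch
                pos = next;
                entry++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return s;
    }

    /** skipBytes may skip fewer bytes than requested, so loop until done. */
    static void skipFully(DataInputStream in, int n) throws IOException {
        while (n > 0) {
            int skipped = in.skipBytes(n);
            if (skipped <= 0) throw new EOFException();
            n -= skipped;
        }
    }

    public static void main(String[] args) {
        Summary s = scan(buildIndex(1000), SAMPLE_INTERVAL);
        System.out.println(s.first + " .. " + s.last + ", deserialized "
                + s.keysDeserialized + " of 1000 keys");
    }
}
```

With 1000 entries and an interval of 128, this sketch deserializes 9 keys 
(8 sampled, including the first, plus the last) instead of all 1000. The last 
entry is detected by comparing the next offset with the total index size, 
which only works because the size is known up front.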

The patch is against the 1.0 branch.
                
> Improve BloomFilter deserialization performance
> -----------------------------------------------
>
>                 Key: CASSANDRA-4023
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4023
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.1
>            Reporter: Joaquin Casares
>            Assignee: Yuki Morishita
>            Priority: Minor
>              Labels: datastax_qa
>             Fix For: 1.0.9, 1.1.0
>
>         Attachments: 4023.txt, cassandra-1.0-4023-v2.txt
>
>
> Startup takes 4x longer on a 1.0.7 cluster than on a 0.8.7 cluster with 
> the same amount of data.
> It seems as though 1.0.7 loads the BloomFilter by reading a series of longs 
> in a multithreaded process, while 0.8.7 reads the entire object at once.
> Perhaps we should update the new BloomFilter to read in batch as well?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
