[jira] [Commented] (CASSANDRA-4023) Batch reading BloomFilters on startup

Michael Harris (Commented) (JIRA) Thu, 08 Mar 2012 18:02:22 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225777#comment-13225777 ]


Michael Harris commented on CASSANDRA-4023:
-------------------------------------------

My $0.02 is that it may help to batch the reads.  I'm not sure whether the 
underlying stream used to read the bloom filters already reads a large chunk 
and caches it, but if it doesn't, then instead of calling ois.readLong() for 
each value you could read 64K or 1M (or whatever size seems appropriate, maybe 
configurable?) into a buffer and pull the longs out of that.  This doesn't 
completely fix the disk contention problem, but it might cause larger 
sequential reads to be submitted to the disk, which might then behave better.
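
To make the idea concrete, here is a rough sketch of what I mean.  It is not 
taken from the Cassandra code: the class name, method name, and chunkBytes 
parameter are all made up for illustration, and it assumes the longs were 
written big-endian, the way DataOutput.writeLong() writes them.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Cassandra: reads longs in large chunks
// instead of issuing one readLong() call per value.
final class BatchedLongReader {

    /**
     * Reads `count` big-endian longs from `in`, requesting up to `chunkBytes`
     * bytes from the stream at a time. Throws EOFException (via readFully)
     * if the stream ends early.
     */
    static long[] readLongs(InputStream in, int count, int chunkBytes) throws IOException {
        if (chunkBytes < Long.BYTES) {
            throw new IllegalArgumentException("chunkBytes must be at least 8");
        }
        // Round the chunk size down to a whole number of longs.
        byte[] chunk = new byte[chunkBytes - (chunkBytes % Long.BYTES)];
        DataInputStream din = new DataInputStream(in);
        long[] out = new long[count];
        int filled = 0;
        while (filled < count) {
            int wantLongs = Math.min(count - filled, chunk.length / Long.BYTES);
            int wantBytes = wantLongs * Long.BYTES;
            // One large read request per chunk rather than one per long.
            din.readFully(chunk, 0, wantBytes);
            // ByteBuffer defaults to big-endian, matching DataOutput.writeLong().
            ByteBuffer.wrap(chunk, 0, wantBytes).asLongBuffer().get(out, filled, wantLongs);
            filled += wantLongs;
        }
        return out;
    }
}
```

A call would look something like readLongs(stream, numLongs, 64 * 1024).  If 
the underlying stream turns out to be unbuffered, simply wrapping it in a 
java.io.BufferedInputStream with a large buffer may give much of the same 
benefit with a smaller change.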
                
> Batch reading BloomFilters on startup
> -------------------------------------
>
>                 Key: CASSANDRA-4023
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4023
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Joaquin Casares
>              Labels: datastax_qa
>
> Startup takes 4x longer on a 1.0.7 cluster than on a 0.8.7 cluster with the 
> same amount of data.
> It seems that 1.0.7 loads the BloomFilter by reading a series of longs in a 
> multithreaded process, while 0.8.7 reads the entire object at once.
> Perhaps we should update the new BloomFilter to read in batches as well?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
