[ https://issues.apache.org/jira/browse/CASSANDRA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225777#comment-13225777 ]

Michael Harris commented on CASSANDRA-4023:
-------------------------------------------

My $0.02 is that it may be helpful to batch reads. I'm not sure whether the underlying stream used to read the bloom filters already reads a large chunk and caches it. If it does not, then instead of just calling ois.readLong() for each value, it could help to read 64K or 1M or whatever you feel is appropriate (maybe configurable?) into a buffer and grab the longs out of that.

This doesn't completely fix the problem of disk contention, but it might cause larger sequential reads to be submitted to the disk, which might then behave better?

> Batch reading BloomFilters on startup
> -------------------------------------
>
>                 Key: CASSANDRA-4023
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4023
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Joaquin Casares
>              Labels: datastax_qa
>
> Startup on a 1.0.7 cluster takes 4x longer than on a 0.8.7 cluster
> with the same amount of data.
> It seems as though 1.0.7 loads the BloomFilter by reading a series of
> longs in a multithreaded process, while 0.8.7 reads the entire object.
> Perhaps we should update the new BloomFilter to do reading in batch as well?
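
The batching idea from the comment can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not Cassandra's actual BloomFilter deserialization code: the class name BatchedLongReader, the method readLongs, and the DEFAULT_CHUNK_BYTES constant are all hypothetical, and the sketch assumes the longs were written big-endian (as DataOutputStream.writeLong does) and that the number of longs is known up front.

```java
import java.io.DataInput;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Cassandra.
final class BatchedLongReader {
    // Assumed default; the comment suggests 64K or 1M, possibly configurable.
    static final int DEFAULT_CHUNK_BYTES = 64 * 1024;

    /**
     * Reads `count` longs from `in`, pulling up to `chunkBytes` bytes per
     * readFully() call instead of issuing one readLong() per value.
     */
    static long[] readLongs(DataInput in, int count, int chunkBytes) throws IOException {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        if (chunkBytes < Long.BYTES) {
            throw new IllegalArgumentException("chunkBytes must be at least " + Long.BYTES);
        }
        // Round the chunk down to a multiple of 8 so no long straddles two chunks.
        byte[] chunk = new byte[chunkBytes - (chunkBytes % Long.BYTES)];
        long[] words = new long[count];
        int filled = 0;
        while (filled < count) {
            long bytesLeft = (long) (count - filled) * Long.BYTES;
            int wanted = (int) Math.min(bytesLeft, chunk.length);
            in.readFully(chunk, 0, wanted); // one large read per chunk
            // ByteBuffer is big-endian by default, matching writeLong().
            ByteBuffer buf = ByteBuffer.wrap(chunk, 0, wanted);
            while (buf.remaining() >= Long.BYTES) {
                words[filled++] = buf.getLong();
            }
        }
        return words;
    }
}
```

A call such as BatchedLongReader.readLongs(in, wordCount, BatchedLongReader.DEFAULT_CHUNK_BYTES) would replace a loop of readLong() calls. If the underlying stream turns out not to buffer, wrapping it in a java.io.BufferedInputStream with a large buffer size is a simpler way to get a similar effect.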