[ https://issues.apache.org/jira/browse/CASSANDRA-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190740#comment-13190740 ]
Pavel Yaskevich commented on CASSANDRA-2392:
--------------------------------------------

Here are the last things with v3:

- The {load, save}Summaries methods leak file descriptors because {o, i}Stream is closed only when the method handles an IOException.

Nit:
{code}
+ FileInputStream input = new FileInputStream(inMemoryDataFile);
+ iStream = new DataInputStream(input);
{code}
and
{code}
+ FileOutputStream input = new FileOutputStream(summaryFile);
+ oStream = new DataOutputStream(input);
{code}
can be changed to
{noformat}
{i,o}Stream = new Data{Input, Output}Stream(new File{Input, Output}Stream(summaryFile));
{noformat}
because the input var is not really needed.

I don't think that "0001-re-factor-first-and-last" is a good idea: by moving the first/last variables to IndexSummary you change their semantics. They no longer indicate the first and last key that the SSTable keeps, but rather the first/last key covered by the IndexSummary of the individual SSTable. I think we really should just keep those variables in the old place.

Also, I'm concerned that CASSANDRA-3762 is marked for 1.2 and this one for 1.1. If we don't get them into one release, start-up times could become even longer than they are right now, which defeats the point of the current task, because there is a good chance that the key cache will be enabled on the big ColumnFamilies.

> Saving IndexSummaries to disk
> -----------------------------
>
>                 Key: CASSANDRA-2392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2392
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: 0001-re-factor-first-and-last.patch,
> 0001-save-summaries-to-disk.patch, 0002-save-summaries-to-disk-v2.patch,
> 0002-save-summaries-to-disk-v3.patch, 0002-save-summaries-to-disk.patch
>
>
> For nodes with millions of keys, doing rolling restarts that take over 10
> minutes per node can be painful if you have 100 node cluster.
> All of our time is spent on doing index summary computations on startup.
> It would be great if we could save those to disk as well. Our indexes are
> quite large.
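
To illustrate the file descriptor point from the comment above, here is a minimal sketch of save/load methods that close their streams on every path, not only when an IOException is handled. It is not the code from the patch: the class name SummaryIoSketch, the method signatures, and the on-disk layout (a length-prefixed array of longs) are assumptions made up for the example. It uses try-with-resources, which needs Java 7 or later; on an older toolchain the equivalent is closing the stream in a finally block.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper, not taken from the patch. It only demonstrates
// closing the streams on both the normal and the exceptional path.
public final class SummaryIoSketch {
    private SummaryIoSketch() {}

    // Writes a length-prefixed array of positions to summaryFile.
    // The file layout is an assumption made for this example.
    public static void saveSummary(File summaryFile, long[] positions) throws IOException {
        // try-with-resources closes oStream, and with it the wrapped
        // FileOutputStream, whether the body returns or throws.
        try (DataOutputStream oStream =
                 new DataOutputStream(new FileOutputStream(summaryFile))) {
            oStream.writeInt(positions.length);
            for (long position : positions) {
                oStream.writeLong(position);
            }
        }
    }

    // Reads back the array written by saveSummary.
    public static long[] loadSummary(File summaryFile) throws IOException {
        try (DataInputStream iStream =
                 new DataInputStream(new FileInputStream(summaryFile))) {
            long[] positions = new long[iStream.readInt()];
            for (int i = 0; i < positions.length; i++) {
                positions[i] = iStream.readLong();
            }
            return positions;
        }
    }
}
```

The streams are constructed inline, as the nit in the comment suggests, so there is no separate variable for the underlying file stream that would also need closing.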