[ https://issues.apache.org/jira/browse/CASSANDRA-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benedict reopened CASSANDRA-8312:
---------------------------------

It looks to me like this doesn't tidy up after itself properly, at least on trunk. It opens an sstable from the snapshot if necessary, references it, and then releases only the reference it acquired, not the extra reference that would permit its bloom filter (BF) etc. to be reclaimed. So this will likely leak significant amounts of memory.

> Use live sstables in snapshot repair if possible
> ------------------------------------------------
>
>                 Key: CASSANDRA-8312
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8312
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jimmy Mårdell
>            Assignee: Jimmy Mårdell
>            Priority: Minor
>             Fix For: 2.0.12, 3.0, 2.1.3
>
>         Attachments: cassandra-2.0-8312-1.txt
>
>
> Snapshot repair can be much slower than parallel repair because of the
> overhead of opening the SSTables in the snapshot. This is particularly true
> when using LCS, as you typically have many smaller SSTables then.
> I compared parallel and sequential repair on a small range on one of our
> clusters (2*3 replicas). With parallel repair, this took 22 seconds. With
> sequential repair (default in 2.0), the same range took 330 seconds! This is
> an overhead of 330 - 22*6 = 198 seconds, just for opening SSTables (there were
> 1000+ sstables). Also, opening 1000 sstables for many smaller ranges surely
> causes a lot of memory churn.
> The idea would be to list the sstables in the snapshot, but use the
> corresponding sstables in the live set if they are still available. For almost
> all sstables, the original one should still exist.
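
To illustrate the leak described in the reopening comment, here is a minimal Java sketch of the reference-counting pattern. It is not Cassandra's code: the names CountedSSTable, acquire, release, repairLeaky and repairTidy are hypothetical, and the sketch assumes that an sstable opened from a snapshot starts with one reference held by whoever opened it, and that its memory (bloom filter etc.) is only reclaimed once the count reaches zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SnapshotRefSketch {

    /** Hypothetical stand-in for an sstable reader; not Cassandra's actual class. */
    static final class CountedSSTable {
        // Assumption: opening the sstable creates it with one reference owned by the opener.
        private final AtomicInteger refs = new AtomicInteger(1);
        private volatile boolean reclaimed = false;

        /** Take an additional reference; returns false if the count already hit zero. */
        public boolean acquire() {
            while (true) {
                int n = refs.get();
                if (n <= 0) return false;
                if (refs.compareAndSet(n, n + 1)) return true;
            }
        }

        /** Drop one reference; resources are reclaimed only when the count reaches zero. */
        public void release() {
            if (refs.decrementAndGet() == 0) {
                reclaimed = true; // stands in for freeing the bloom filter etc.
            }
        }

        public boolean isReclaimed() { return reclaimed; }
    }

    /** Leaky pattern: releases only the reference it acquired itself. */
    public static CountedSSTable repairLeaky() {
        CountedSSTable fromSnapshot = new CountedSSTable(); // count = 1 (opener's reference)
        fromSnapshot.acquire();                              // count = 2
        // ... the repair work would read from the sstable here ...
        fromSnapshot.release();                              // count = 1, never reaches 0
        return fromSnapshot;
    }

    /** Tidy pattern: also releases the opener's reference for snapshot-opened sstables. */
    public static CountedSSTable repairTidy() {
        CountedSSTable fromSnapshot = new CountedSSTable(); // count = 1
        fromSnapshot.acquire();                              // count = 2
        // ... the repair work would read from the sstable here ...
        fromSnapshot.release();                              // count = 1
        fromSnapshot.release();                              // count = 0, resources reclaimed
        return fromSnapshot;
    }

    public static void main(String[] args) {
        System.out.println("leaky reclaimed: " + repairLeaky().isReclaimed());
        System.out.println("tidy reclaimed: " + repairTidy().isReclaimed());
    }
}
```

In this sketch the extra release applies only to sstables the repair opened from the snapshot itself. An sstable taken from the live set presumably keeps its original reference with its existing owner, so only the acquired reference would be released there.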