[ https://issues.apache.org/jira/browse/CASSANDRA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004206#comment-13004206 ]
Jonathan Ellis commented on CASSANDRA-2191:
-------------------------------------------

I would prefer simply trying to solve reporting for CM/CE; we only have information on those tasks because we have ICompactionInfo to tell us about it. I don't think we want to turn this into "add some kind of progress reporting for generic Runnables."

> Multithread across compaction buckets
> -------------------------------------
>
>                 Key: CASSANDRA-2191
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2191
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Priority: Critical
>              Labels: compaction
>             Fix For: 0.8
>
>         Attachments: 0001-Add-a-compacting-set-to-sstabletracker.txt,
>                      0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt
>
>
> This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and
> reasoning are different enough to open a separate issue.
>
> The problem with compactions currently is that they compact the set of
> sstables that existed the moment the compaction started. This means that for
> longer running compactions (even when running as fast as possible on the
> hardware), a very large number of new sstables might be created in the
> meantime. We have observed this proliferation of sstables killing performance
> during major/high-bucketed compactions.
>
> One approach would be to pause compactions in upper buckets (containing
> larger files) when compactions in lower buckets become possible. While this
> would likely solve the problem with read performance, it does not actually
> help us perform compaction any faster, which is a reasonable requirement for
> other situations.
>
> Instead, we need to be able to perform any compactions that are currently
> required in parallel, independent of what bucket they might be in.
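
For readers unfamiliar with the idea behind the attached patches, here is a minimal sketch of how a "compacting set" could let buckets be compacted in parallel without two tasks touching the same sstable. It is not the code from the patches: the class names CompactingSet and BucketScheduler, the use of plain strings as sstable identifiers, and the compactor callback are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical stand-in for the "compacting set" kept by the sstable tracker.
final class CompactingSet {
    private final Set<String> compacting = new HashSet<>();

    // Atomically claim every sstable in the bucket, or none of them.
    synchronized boolean tryMark(Collection<String> sstables) {
        for (String sstable : sstables) {
            if (compacting.contains(sstable)) {
                return false; // already claimed by another compaction
            }
        }
        compacting.addAll(sstables);
        return true;
    }

    // Release the sstables once their compaction has finished.
    synchronized void unmark(Collection<String> sstables) {
        compacting.removeAll(sstables);
    }

    synchronized int size() {
        return compacting.size();
    }
}

// Hypothetical scheduler: one task per bucket that could be fully claimed.
final class BucketScheduler {
    static List<Future<?>> schedule(List<List<String>> buckets,
                                    CompactingSet tracker,
                                    ExecutorService pool,
                                    Consumer<List<String>> compactor) {
        List<Future<?>> futures = new ArrayList<>();
        for (List<String> bucket : buckets) {
            if (!tracker.tryMark(bucket)) {
                continue; // overlaps a running compaction; skip for now
            }
            futures.add(pool.submit(() -> {
                try {
                    compactor.accept(bucket);
                } finally {
                    tracker.unmark(bucket); // always release the claim
                }
            }));
        }
        return futures;
    }
}
```

Under these assumptions, a bucket is skipped rather than queued when any of its sstables is already claimed, so a later scheduling pass would need to pick it up again once the running compaction releases them.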